ABSTRACT



In 1990, the National Assessment of Educational Progress
(NAEP) included a Trial State Assessment (TSA); for the first time in
NAEP's history, voluntary state-by-state assessments were made. The sample
was designed to represent the 8th grade public school population in a state
or territory. In 1996, 44 states, the District of Columbia, Guam, and the
Department of Defense schools took part in the NAEP state science assessment
program. The NAEP 1996 state science assessment was at grade 8 only, although
grades 4, 8, and 12 were assessed at the national level as usual. The 1996
state science assessment covered three major fields: earth, physical, and
life sciences. In Montana, 2,029 public school and 154 nonpublic school
students in 79 public schools and 13 nonpublic schools were assessed. This
report describes the science proficiency of Montana eighth-graders, compares
their overall performance to students in the West region of the United States
and the entire United States (using data from the NAEP national assessment),
presents the average proficiency for the three major fields, and summarizes
the performance of subpopulations (gender, race/ethnicity, parents'
educational level, Title I participation, and free/reduced lunch program
eligibility). To provide a context for the assessment data, participating
students, their science teachers, and principals completed questionnaires
which focused on: instructional content (curriculum coverage, amount of
homework); delivery of science instruction (availability of resources, type);
use of computers in science instruction; educational background of teachers;
and conditions facilitating science learning (e.g., hours of television
watched, absenteeism). On the NAEP fields of science scales that range from 0
to 300, Montana students had an average proficiency of 162 compared to 148
throughout the United States. The average science scale score of males did
not differ significantly from that of females in either Montana or the
nation. At the eighth grade, White students in Montana had an average science
scale score that was higher than those of Hispanic and American Indian
students. In Montana at grade 8, the average scale score of public school
students (162) was not significantly different from that of nonpublic school
students (158). (DDR/NB)
HIGHLIGHTS



Monitoring the performance of students in subjects such as science is a key
concern of the citizens, policy makers, and educators who direct educational reform 
efforts. The 1996 National Assessment of Educational Progress (NAEP) in science 
assesses the current level of science performance as a mechanism for informing 
education reform. This science assessment is the first to be constructed on a new
framework, and it is also the first to be given at the state level. This report contains 
results for public and nonpublic school students at grade 8. 

What Is NAEP? 

The National Assessment of Educational Progress (NAEP), the “Nation’s Report 
Card,” is the only ongoing nationally representative assessment of what America’s 
students know and can do in various academic subjects. Since 1969, NAEP assessments 
have been conducted with national samples of students in the areas of reading, 
mathematics, science, writing, and other fields. By making information on student
performance available to policy makers, educators, and the general public, NAEP is an 
integral part of our nation’s evaluation of the conditions and progress of education. 

NAEP is a congressionally mandated project of the National Center for Education 
Statistics, U.S. Department of Education. Results are provided only for group 
performance. NAEP is forbidden by law to report results at an individual or school 
level. 

In 1990 Congress authorized a voluntary state-by-state NAEP assessment. The 
1990 Trial State Assessment in mathematics at grade 8 was the first state-level NAEP 
assessment. Since then, state-level assessments have taken place in 1992 and 1994 in 
reading (grade 4), in 1992 and 1996 in mathematics (grades 4 and 8), and in 1996 in 
science (grade 8). In 1996, 44 states, the District of Columbia, Guam, and the 
Department of Defense Schools took part in the NAEP state assessment program. The 
NAEP 1996 state science assessment was at grade 8 only, although grades 4, 8, and 12 
were assessed at the national level as usual. 
NAEP 1996 Science Assessment

The NAEP 1996 science assessment was developed using a new framework. This 
framework was produced by educators, administrators, assessment experts, and 
curriculum specialists using a national consensus process. The framework was designed 
to reflect current practices in science teaching. It called for the use of multiple-choice 
questions and constructed-response questions that required both short and extended
responses. The constructed-response questions served as indicators of students’ ability
to know and integrate facts and scientific concepts, their ability to reason, and their
ability to communicate scientific information. In the 1996 assessment, these
constructed-response questions constituted nearly 80 percent of the total student response
time. The NAEP 1996 assessment in science also included hands-on tasks that enabled 
students to demonstrate directly their knowledge and skills related to scientific 
investigation. 

The 1996 science framework was structured according to a matrix that consisted
of the three traditional fields of science (earth, physical, and life) crossed with three 
processes of knowing and doing science (conceptual understanding, scientific 
investigation, and practical reasoning). A central category encompassing the nature of 
science and the nature of technology was woven throughout the assessment, as was a 
themes category representing major ideas or key concepts that transcend scientific 
disciplines.¹

Students’ science performance is summarized on the NAEP science scales, which 
range from 0 to 300 at each grade. While the scale score ranges are identical for grades
4, 8, and 12, the scales were derived independently at each grade. For example, scale 
scores on the grade 8 scale cannot imply anything about performance at grade 12 in the 
national assessment. The science scale is discussed in Appendix C of this report, the 
NAEP 1996 Science State Report for Montana (see C.9). Note that the national average 
for the combined public and nonpublic school population is 150; the average for public 
schools only (appropriate for most tables in this report) is 148. 

Comparison of Montana to the Nation 

Table H.1 shows the distribution of science scale scores for eighth-grade students
attending public schools in Montana, the West region, and the nation in 1996. See 
Chapter 2 (Table 2.6) of this report for the results for the nonpublic and the combined 
public and nonpublic school populations. 

• The average science scale score for eighth graders in public schools in 
Montana was 162. This average was higher than that for public school 
students across the nation (148).²



¹ More details about the NAEP 1996 science assessment can be found in Appendix B of this report, the NAEP 1996
Science State Report for Montana.

² Differences reported as significant are statistically different at the 95 percent confidence level. This means that with
95 percent confidence there is a real difference in the average science scale score between the two populations of
interest.


TABLE H.1

Distribution of Science Scale Scores for Public School Students at Grade 8

              Average       10th         25th         50th         75th         90th
              Scale Score   Percentile   Percentile   Percentile   Percentile   Percentile

Montana       162 (1.2)     127 (2.6)    146 (1.7)    164 (1.2)    180 (0.6)    194 (1.9)
West          148 (2.2)     101 (3.3)    127 (3.1)    151 (2.0)    172 (1.7)    190 (3.7)
Nation        148 (0.9)     102 (1.6)    126 (1.3)    151 (0.9)    172 (1.1)    191 (1.3)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
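
The note to Table H.1 can be illustrated with a short calculation. The Python sketch below forms the approximate 95 percent interval (estimate plus or minus 2 standard errors) for an average scale score and then compares two averages using a standard error of the difference. It assumes the two samples are independent, so that the standard error of the difference is the square root of the sum of the squared standard errors; Appendix A describes the procedure actually used in the report, which may differ. The function names are illustrative only.

```python
import math


def interval_95(estimate, std_error):
    """Approximate 95 percent interval: estimate +/- 2 standard errors."""
    return (estimate - 2 * std_error, estimate + 2 * std_error)


def se_of_difference(se_a, se_b):
    """Standard error of a difference, ASSUMING independent samples."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def differs_significantly(est_a, se_a, est_b, se_b):
    """True if the gap exceeds 2 standard errors of the difference."""
    return abs(est_a - est_b) > 2 * se_of_difference(se_a, se_b)


# Average scale scores and standard errors from Table H.1 (public schools).
montana = (162, 1.2)
nation = (148, 0.9)

print("Montana interval:", interval_95(*montana))
print("SE of difference:", se_of_difference(montana[1], nation[1]))
print("Significant difference:", differs_significantly(*montana, *nation))
```

Under these assumptions the 14-point gap between Montana and the nation is far larger than twice the standard error of the difference, which is consistent with the report's statement that Montana's average was higher.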



Major Findings for Student Subpopulations 

The preceding section provided a view of the overall science performance of
eighth-grade students in Montana. It is also important to examine the average science
scale scores of subgroups within the population. Typically, NAEP presents results for
demographic subgroups defined by gender, race/ethnicity, and parental education. In
addition, in 1996 NAEP collected information on student participation in two federally
funded programs: Title I programs and the free/reduced-price lunch component of the
National School Lunch Program. The 1996 state assessment in science also continues 
a component first introduced with the NAEP 1994 state assessment in reading — 
assessment of a representative sample of nonpublic school students. 

The reader is cautioned against using NAEP results to make simple or causal 
inferences related to subgroup membership. Differences among groups of students are 
almost certainly associated with a broad range of socioeconomic and educational factors 
not discussed in NAEP reports and possibly not addressed by the NAEP assessment 
program. 
Results related to gender and race/ethnicity for public school students are
highlighted below. A comparison of public and nonpublic school results is also 
presented. More complete results for the various demographic subgroups examined by 
the NAEP science assessment can be found in Chapter 2 of this report, the NAEP 1996 
Science State Report for Montana. 

• The average science scale score of males did not differ significantly 
from that of females in either Montana or the nation. 

• At the eighth grade, White students in Montana had an average science
scale score that was higher than those of Hispanic and American Indian 
students. 

• In Montana at grade 8, the average scale score of public school students 
(162) was not significantly different from that of nonpublic school 
students (158). 

Finding a Context for Understanding Students’ Science Performance 
in Public Schools 

The science performance of students in Montana may be better understood when 
viewed in the context of the environment in which students are learning. This 
educational environment is largely determined by school policies and practices, by 
characteristics of science instruction in the school, by home support for academics and 
other home influences, and by students’ own views about science. Information about 
this environment is gathered by means of questionnaires completed by principals and 
teachers as well as questions answered by students as part of the assessment. 

Because NAEP is administered to a sample of students that is representative of 
all eighth-grade students in Montana schools, NAEP results provide a view of the 
educational practices in Montana that may be useful for improving instruction and 
setting policy. However, despite the richness of context provided by the NAEP results, 
it is very important to note that NAEP data cannot establish a cause-and-effect 
relationship between educational environment and students’ scores on the NAEP science 
assessment. 
The following results are for public school students:

School Science Education Policies and Practices³

• In Montana, the percentage of eighth-grade students attending public 
schools that reported science was a priority (25 percent) was smaller 
than the percentage of eighth-grade students nationwide (43 percent). 

• The percentage of eighth-grade students in Montana who attended 
schools that were expected to follow a district or state curriculum 
(87 percent) was not significantly different from* the national 
percentage (94 percent). 

• Relatively few of the students in Montana had teachers who reported 
receiving all of the resources they needed for classroom instruction 
(10 percent). This was not significantly different from the 
corresponding percentage of eighth-grade students nationwide 
(11 percent). 

• In Montana, 27 percent of the eighth-grade students were taught by 
teachers who reported that there was a curriculum specialist available 
to help or advise them in science. This figure was smaller than that of 
students across the nation (43 percent). 

Science Classroom Practices⁴

• Less than one fifth of the eighth-grade students in Montana had science
teachers who reported spending a lot of time on earth science
(16 percent); a large majority had teachers who reported spending a lot
of time on physical science (81 percent); and relatively few had teachers
who reported spending a lot of time on life science (13 percent).

• Less than one fifth of the students in Montana (14 percent) had teachers 
who reported they planned to place moderate emphasis on the
understanding of key science concepts by their students. This 
percentage was smaller than that of students whose teachers planned 
heavy emphasis on conceptual understanding (86 percent). 



* Although the difference may appear large, recall that “significance” here refers to “statistical significance.” 

³ More detailed results related to school policies and practices can be found in Chapter 3 of this report, the NAEP 1996
Science State Report for Montana.

⁴ More detailed results related to classroom practices can be found in Chapter 4 of this report, the NAEP 1996 Science
State Report for Montana.
• In Montana, the percentage of eighth-grade students whose teachers
reported they planned to give moderate emphasis to developing science
problem-solving skills (31 percent) was smaller than that of students 
whose teachers planned heavy emphasis on this topic (69 percent). 

• Teachers of 57 percent of the students in Montana reported that they 
planned to place moderate emphasis on knowing how to communicate 
ideas in science effectively, greater than the percentage of students
whose teachers reported giving this topic heavy emphasis (35 percent). 

• In Montana, 19 percent of eighth graders reported not spending any time 
on science homework in a typical week while 39 percent spent one hour 
or more on their science homework each week. 

Scientific Investigations⁵

• Of the eighth-grade students in Montana, 91 percent had teachers who 
reported giving moderate to heavy emphasis on the development of data 
analysis skills. This percentage was not significantly different from that 
of students nationwide (89 percent). 

• A large majority of the eighth graders in Montana had teachers who 
reported their students performed hands-on activities or investigations 
in science once a week or more (80 percent). 

Influences Beyond School That Facilitate Learning Science⁶

• The percentage of eighth graders in Montana who reported watching six 
or more hours of television a day (9 percent) was smaller than the 
percentage for the nation (17 percent). 

• In Montana, 43 percent of eighth graders agreed that science is useful 
for solving everyday problems. 



⁵ More detailed results related to scientific investigations can be found in Chapter 5 of this report, the NAEP 1996 Science
State Report for Montana.

⁶ More detailed results related to influences beyond school that facilitate learning science can be found in Chapter 6 of
this report, the NAEP 1996 Science State Report for Montana.
INTRODUCTION



Improving education is often seen as an important first step as the United States
attempts to remain competitive in an increasingly technical global economy. At the 1996 
Governors’ Summit in Palisades, New York, the President and the Governors
reaffirmed the need to strengthen our schools and strive for world-class standards. 
Furthermore, in his 1997 State of the Union Address, President Clinton placed education 
center stage and called for states to commit to national standards that represent what all 
students must know to succeed in the knowledge-based economy of the twenty-first 
century. 

In 1983, the National Commission on Excellence in Education issued a report 
entitled A Nation at Risk: The Imperative for Educational Reform that was critical of 
education in the United States.⁷ Interest in reform was also fueled by the publication
of other reports and analyses that pointed out the deficiencies of the educational system
and noted how these could be rectified.⁸ Since then, organizations from the public and
private sectors have assumed pivotal roles in providing support to state and local
educational establishments as they seek to reform their educational systems in areas such
as the development of standards, revision of curricula, development of appropriate
assessment techniques, and professional development.⁹ In addition to these activities,
organizations such as the National Science Teachers Association and the American
Association for the Advancement of Science have worked closely with the National
Research Council to produce documents that help teachers interpret the National Science
Education Standards that were published in 1995.¹⁰ As the new century approaches,
commitment to science reform continues. 



⁷ A Nation at Risk: The Imperative for Educational Reform. (Washington, DC: National Commission on Excellence in
Education, 1983).

⁸ Educating Americans for the 21st Century: A Report to the American People and the National Science Board.
(Washington, DC: National Science Board, Commission on Precollege Education in Mathematics, Science, and
Technology, 1983).

⁹ Statewide Systemic Initiatives in Science, Mathematics, and Engineering. (Arlington, VA: The National Science
Foundation, 1995-1996); Scope, Sequence, and Coordination of Secondary School Science. Volume I: The Content Core;
Volume II: Relevant Research. (Washington, DC: National Science Teachers Association, 1992); Benchmarks for
Science Literacy. (Washington, DC: Project 2061, American Association for the Advancement of Science, 1993); New
Standards Project. (Washington, DC: National Research Council, 1995).

¹⁰ National Science Education Standards. (Washington, DC: National Research Council, 1996).
Monitoring the performance of students in science is a key concern of the state
and national policy makers and educators who direct educational reform efforts. To this 
end, the 1996 National Assessment of Educational Progress (NAEP) is an important 
source of information on what the nation’s students know and can do in science. 

What Was Assessed? 

The science assessment was crafted to measure the content and skills specified in 
the science framework for the 1996 NAEP. Two organizing concepts underlie the 
science framework. First, scientific knowledge should be structured so as to make 
factual information meaningful. The way in which knowledge is structured should be 
influenced by the context in which the knowledge is being presented. Second, science 
performance depends on knowledge of facts, the ability to integrate this knowledge into 
larger constructs, and the capacity to use the tools, procedures, and reasoning processes 
of science to develop an increased understanding of the natural world. Thus, the 
framework called for the NAEP 1996 science assessment to include the following: 

• Multiple-choice questions that assess students’ knowledge of important 
facts and concepts and that probe their analytical reasoning skills; 

• Constructed-response questions that explore students’ abilities to
explain, integrate, apply, reason about, plan, design, evaluate, and 
communicate scientific information; and 

• Hands-on tasks that probe students’ abilities to use materials to make 
observations, perform investigations, evaluate experimental results, and 
apply problem-solving skills. 

The core of the science framework is organized along two dimensions. The first 
dimension divides science into three major fields: earth, physical, and life sciences. 

The second dimension defines characteristic elements of knowing and doing science: 
conceptual understanding, scientific investigation, and practical reasoning. Each 
question in the assessment is categorized as measuring one of the elements of knowing 
and doing within one of the fields of science (e.g., scientific investigation in the context 
of earth science). The framework also contains two overarching domains — the nature 
of science and the organizing themes of science. The nature of science encompasses the 
historical development of science and technology, the habits of mind that characterize 
science, and the methods of inquiry and problem solving. It also includes the nature 
of technology — specifically, design issues involving the application of science to 
real-world problems and associated trade-offs or compromises. The themes of science 
include the notions of systems and their application in the scientific disciplines, models 
and their functioning in the development of scientific understanding, and patterns of
change as they are exemplified in natural phenomena. A fuller description of the 
framework is provided in Appendix B. 
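
The two-dimensional core of the framework can be pictured as a small matrix. The following Python sketch simply enumerates the nine combinations of field and element of knowing and doing science named above; the variable and function names are illustrative and are not taken from the framework itself, and the two overarching domains (nature of science, themes) are not modeled.

```python
from itertools import product

# The two dimensions of the core framework, as named in the text.
FIELDS = ("earth", "physical", "life")
ELEMENTS = (
    "conceptual understanding",
    "scientific investigation",
    "practical reasoning",
)


def framework_cells():
    """Return every (field, element) pair in the framework matrix."""
    return list(product(FIELDS, ELEMENTS))


# Each assessment question is categorized into exactly one of these cells.
for field, element in framework_cells():
    print(f"{element} in the context of {field} science")
```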
Who Was Assessed?

School and Student Characteristics 

Table 1.1 provides demographic profiles of the eighth-grade students in Montana, 
the West region, and the nation. These profiles are based on data collected from the 
students and schools participating in the 1996 state and national science assessments at 
grade 8. As described in Appendix A, the state data and the regional and national data 
are drawn from separate samples. 

In 1996, approximately 95 percent of eighth graders in Montana attended public 
schools, with the remaining students attending nonpublic schools (including Catholic and 
other private schools). For the nation, 89 percent of students at grade 8 attended public 
schools in 1996. 

To ensure comparability across jurisdictions, NCES has established guidelines for 
school and student participation rates. Appendix A highlights these guidelines, and 
jurisdictions failing to meet these guidelines are noted in tables and figures in NAEP 
reports containing state-by-state results. For jurisdictions failing to meet the initial
school participation rate of 70 percent, results are not reported. 

Schools and Students Assessed 

Table 1.2 summarizes participation data for schools and students sampled in 
Montana for the 1996 state assessment program in science.¹¹

In Montana, 79 public schools and 13 nonpublic schools participated in the 1996 
eighth-grade science assessment. These numbers include participating substitute schools 
that were selected to replace some of the nonparticipating schools from the original 
sample. The weighted school participation rate after substitution in 1996 was 76 percent 
for public schools and 97 percent for nonpublic schools, which means that the 
eighth-grade students in this sample were directly representative of 76 percent and 
97 percent of all the eighth-grade public and nonpublic school students, respectively, in 
Montana. 

In each school, a random sample of students was selected to participate in the 
assessment. In Montana in 1996, on the basis of sample estimates, 0 percent of the 
eighth-grade public school population and 12 percent of the nonpublic school population 
were classified as students with limited English proficiency (LEP). In addition,
9 percent of eighth graders in public schools and 1 percent of eighth graders in
nonpublic schools had an Individual Education Plan (IEP). An IEP is a plan written for
a student who has been determined to be eligible for special education. The IEP
typically sets forth goals and objectives for the student and describes a program of
activities and/or related services necessary to achieve the goals and objectives.



¹¹ For a detailed discussion of the NCES guidelines for sample participation, see Appendix A of this report or the
Technical Report of the NAEP 1996 State Assessment Program in Science. (Washington, DC: National Center for
Education Statistics, 1997).
TABLE 1.1

Profile of Students in Montana, the West Region, and the Nation at Grade 8

Demographic Subgroups                     Public        Nonpublic     Combined
                                                        Percentage

RACE/ETHNICITY
  Montana
    White                                 83 (1.9)      76 (9.5)      83 (1.9)
    Black                                  1 (0.1)       3 (--)        1 (0.2)
    Hispanic                               5 (0.5)       8 (3.3)       5 (0.5)
    Asian/Pacific Islander                 1 (0.2)       0 (****)      1 (0.2)
    American Indian                       10 (1.7)      13 (6.0)      10 (1.6)
  West
    White                                 65 (2.9)      67 (7.8)      65 (2.8)
    Black                                  6 (1.3)       5 (2.1)       6 (1.2)
    Hispanic                              21 (2.3)      19 (6.1)      21 (2.2)
    Asian/Pacific Islander                 4 (0.8)       8 (1.7)       5 (0.7)
    American Indian                        2 (0.9)       0 (****)      2 (0.8)
  Nation
    White                                 68 (0.4)      80 (2.7)      70 (0.2)
    Black                                 15 (0.3)       7 (1.5)      14 (0.1)
    Hispanic                              12 (0.3)       9 (2.1)      12 (0.2)
    Asian/Pacific Islander                 2 (0.3)       4 (0.8)       3 (0.3)
    American Indian                        2 (0.3)       1 (0.2)       2 (0.2)

PARENTS' EDUCATION
  Montana
    Did not finish high school             5 (0.5)       7 (3.4)       5 (0.5)
    Graduated from high school            19 (1.4)       9 (2.9)      18 (1.3)
    Some education after high school      22 (0.8)      23 (5.8)      22 (0.9)
    Graduated from college                48 (1.4)      48 (10.6)     48 (1.5)
    I don't know.                          6 (0.6)      14 (9.2)       6 (0.8)
  West
    Did not finish high school             7 (0.9)       2 (0.8)       7 (0.8)
    Graduated from high school            18 (0.9)       6 (2.2)      17 (0.8)
    Some education after high school      21 (1.6)      12 (2.4)      20 (1.5)
    Graduated from college                44 (2.3)      73 (5.5)      46 (2.3)
    I don't know.                         10 (0.9)       7 (2.5)      10 (0.8)
  Nation
    Did not finish high school             7 (0.5)       2 (0.3)       6 (0.4)
    Graduated from high school            21 (1.0)      10 (1.1)      20 (0.9)
    Some education after high school      20 (0.7)      17 (1.8)      20 (0.7)
    Graduated from college                42 (1.3)      66 (3.0)      45 (1.2)
    I don't know.                         10 (0.6)       6 (1.0)       9 (0.5)

GENDER
  Montana
    Male                                  49 (1.5)      49 (4.4)      49 (1.5)
    Female                                51 (1.5)      51 (4.4)      51 (1.5)
  West
    Male                                  51 (1.7)      51 (3.1)      51 (1.6)
    Female                                49 (1.7)      49 (3.1)      49 (1.6)
  Nation
    Male                                  51 (1.2)      51 (1.8)      51 (1.0)
    Female                                49 (1.2)      49 (1.8)      49 (1.0)

(continued)
TABLE 1.1 (continued)

Profile of Students in Montana, the West Region, and the Nation at Grade 8

Demographic Subgroups                     Public        Nonpublic     Combined
                                                        Percentage

TITLE I
  Montana
    Participated                           9 (1.1)       0 (--)        8 (1.1)
    Did not participate                   91 (1.1)     100 (--)       92 (1.1)
  West
    Participated                          15 (4.3)       2 (****)     14 (4.1)
    Did not participate                   85 (4.3)      98 (****)     86 (4.1)
  Nation
    Participated                          13 (2.3)       7 (3.6)      12 (2.1)
    Did not participate                   87 (2.3)      93 (3.6)      88 (2.1)

FREE/REDUCED-PRICE LUNCH
  Montana
    Eligible                              25 (1.8)       7 (4.4)      24 (1.8)
    Not eligible                          60 (2.8)      79 (10.2)     61 (2.8)
    Information not available             16 (2.8)      14 (8.2)      16 (2.7)
  West
    Eligible                              25 (3.1)       3 (1.8)      23 (3.0)
    Not eligible                          47 (6.9)      34 (13.7)     47 (6.6)
    Information not available             28 (8.6)      64 (14.3)     30 (8.3)
  Nation
    Eligible                              29 (1.6)       7 (3.4)      26 (1.5)
    Not eligible                          51 (3.6)      49 (7.7)      51 (3.3)
    Information not available             20 (4.4)      44 (8.2)      23 (4.1)



The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). The percentages
for Race/Ethnicity may not add to 100 percent because some students categorized themselves as "Other." **** Standard
error estimates cannot be accurately determined.

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 



Schools were permitted to exclude certain students from the assessment, provided 
that the following criteria were met. To be excluded, a student had to be categorized
as LEP or had to have an IEP and (in either case) be judged incapable of participating
in the assessment. The intent was to assess all selected students; therefore, all selected
students who were capable of participating in the assessment should have been assessed.
However, schools were allowed to exclude those students who, in the judgment of school
staff, could not meaningfully participate. The NAEP guidelines for inclusion are
intended to assure uniformity of inclusion criteria from school to school. Note that some
students classified as LEP and some students having an IEP were deemed eligible to
participate and were included in the assessment. In Montana, the students who were
excluded from the assessment because they were categorized as LEP or had an IEP
represented 3 percent of the public school population and 1 percent of the nonpublic 
school population in grade 8. 
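
The exclusion rule described above can be restated as a small predicate. The Python sketch below is only an illustration of the stated logic: a student may be excluded only if categorized as LEP or holding an IEP and, in either case, judged incapable of participating. The function and argument names are hypothetical, and the judgment of school staff is modeled as a plain boolean rather than as the actual NAEP inclusion guidelines.

```python
def may_be_excluded(is_lep, has_iep, judged_incapable):
    """Return True only when the stated exclusion criteria are both met.

    is_lep           -- student is classified as limited English proficient
    has_iep          -- student has an Individual Education Plan
    judged_incapable -- school staff judge the student unable to participate
    """
    return (is_lep or has_iep) and judged_incapable


# An LEP student judged capable of participating is still assessed.
print(may_be_excluded(is_lep=True, has_iep=False, judged_incapable=False))
# A student with an IEP who is judged incapable may be excluded.
print(may_be_excluded(is_lep=False, has_iep=True, judged_incapable=True))
```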
In Montana, 2,029 public school and 154 nonpublic school eighth-grade students
were assessed in 1996. The weighted student participation rate was 92 percent for 
public schools and 93 percent for nonpublic schools. This means that the sample of 
eighth-grade students who took part in the assessment was directly representative of 
92 percent of the eligible public school student population and 93 percent of the eligible 
nonpublic school population in participating schools in Montana (that is, all students 
from the population represented by the participating schools, minus those students 
excluded from the assessment). The overall weighted response rate (school rate times 
student rate) was 70 percent and 90 percent for public and nonpublic schools, 
respectively. This means that the sample of students who participated in the assessment 
was directly representative of 70 percent of the eligible eighth-grade public school 
population and 90 percent of the eligible eighth-grade nonpublic school population in 
Montana. 
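
The overall weighted response rate can be checked directly from the figures given above, since it is defined as the school rate times the student rate. The Python sketch below uses the weighted school participation rates after substitution (76 and 97 percent) and the weighted student participation rates (92 and 93 percent) stated in the text; the function name is illustrative, and the inputs are the rounded published percentages, so the products are approximate.

```python
def overall_response_rate(school_rate, student_rate):
    """Overall weighted response rate: school rate times student rate."""
    return school_rate * student_rate


# Rounded rates reported for Montana at grade 8 in 1996.
public = overall_response_rate(0.76, 0.92)
nonpublic = overall_response_rate(0.97, 0.93)

print(f"Public schools:    {public:.0%}")
print(f"Nonpublic schools: {nonpublic:.0%}")
```

Rounded to whole percentages, the two products agree with the 70 percent and 90 percent figures reported in the text.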

In accordance with standard practice in survey research, the results presented in 
this report were based on calculations that incorporate adjustments for the 
nonparticipating schools and students. Hence, the final results derived from the sample 
provide estimates of the science performance for the full population of eligible public 
and nonpublic school eighth-grade students in Montana. However, in instances where 
nonparticipation rates are large, these nonparticipation adjustments may not adequately 
compensate for the missing sample schools and students. 

In order to guard against potential nonparticipation bias in published results, the 
National Center for Education Statistics (NCES) has established minimum participation 
levels as a condition for the publication of 1996 state assessment program results. NCES 
also established additional guidelines addressing four ways in which nonparticipation 
bias could be introduced into a jurisdiction’s published results (see Appendix A). In 
1996 Montana met minimum participation levels for both public and nonpublic schools 
at grade 8. Hence, results for both types of schools are included in this report. Montana 
met all other established NCES participation guidelines for nonpublic schools but failed 
to meet one or more of these guidelines for public schools. The public school weighted 
participation rate for the initial sample of schools was below 85% AND the weighted 
school participation rate after substitution was below 90% (see Appendix A). 
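The school participation guideline cited above can be expressed as a simple check. The sketch below is illustrative only: the function and parameter names are hypothetical, and it covers just the one guideline quoted in the text (both conditions must hold), not the full set of NCES guidelines in Appendix A.

```python
def fails_school_participation_guideline(
    initial_rate: float,
    after_substitution_rate: float,
    initial_floor: float = 0.85,
    substituted_floor: float = 0.90,
) -> bool:
    """True when the initial rate is below 85% AND the rate after substitution is below 90%."""
    return initial_rate < initial_floor and after_substitution_rate < substituted_floor


# Weighted school participation rates from the participation table in this report.
public_flagged = fails_school_participation_guideline(0.70, 0.76)
nonpublic_flagged = fails_school_participation_guideline(0.93, 0.97)

print(public_flagged, nonpublic_flagged)
```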

In the analysis of student data and reporting of results, nonresponse weighting 
adjustments have been made at both the school and student level, with the aim of making 
the sample of participating students as representative as possible of the entire eligible 
eighth-grade population. For details of the nonresponse weighting adjustment 
procedures, see the Technical Report of the NAEP 1996 State Assessment Program in 
Science. 

TABLE 1.2 

School and Student Participation at Grade 8 in Montana 

                                                                Public   Nonpublic

SCHOOL PARTICIPATION
Weighted school participation rate before substitution            70%       93%
Weighted school participation rate after substitution             76%       97%
Number of schools originally sampled                              113        20
Number of schools not eligible                                      5         5
Number of schools in original sample participating                 68        12
Number of substitute schools provided                              28         3
Number of substitute schools participating                         11         1
Total number of participating schools                              79        13

STUDENT PARTICIPATION
Weighted student participation rate after makeups                 92%       93%
Number of students selected to participate in the assessment    2,339       167
Number of students withdrawn from the assessment                   86         3
Percentage of students who were of Limited English Proficiency     0%       12%
Percentage of students excluded from the assessment
  due to Limited English Proficiency                               0%        0%
Percentage of students who had an Individualized Education Plan    9%        1%
Percentage of students excluded from the assessment
  due to Individualized Education Plan status                      3%        1%
Number of students to be assessed                               2,206       163
Number of students assessed                                     2,029       154
Overall weighted response rate                                    70%       90%

Montana’s public school weighted participation rate for the initial sample of schools was below 85% AND the weighted 
school participation rate after substitution was below 90%. See Appendix A for details. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 

Reporting NAEP Science Results 

The NAEP Science Scale 

The NAEP 1996 science assessment spans the broad field of science in each of 
the grades assessed. Because of the survey nature of the assessment and the breadth 
of the domain, each student participating cannot be expected to answer all the questions 
in the assessment since this would impose an unreasonable burden on students and their 
schools. Thus, each student was administered a portion of the assessment, and data were 
combined across students to report on the achievement of eighth graders and on the 
achievement of subgroups of students (e.g., subgroups defined by gender or parental 
education). 

Student responses to the assessment questions were analyzed to determine the 
percentage of students responding correctly to each multiple-choice question and the 
percentage of students achieving each of the score categories for constructed-response 
questions. Item response theory (IRT) methods were used to produce scales that 
summarized results for each of the three fields of science (i.e., earth, physical, and life) 
at each grade level. An overall composite scale also was developed at each of grades 
4, 8, and 12 by weighting the separate scales based on the relative importance of each 
field of science in the NAEP science framework. Results presented in this report are 
based on this overall composite scale, which ranges from 0 to 300. 
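The idea of a composite formed by weighting the field scales can be sketched as follows. This is an illustration only: the weights below are placeholders chosen for the example and are not taken from this section (the actual weights come from the NAEP science framework), and the helper name is hypothetical. The field averages are the Montana public school averages reported later in this report.

```python
# Montana public school averages by field of science (from this report).
field_means = {"earth": 162.0, "physical": 163.0, "life": 161.0}

# ASSUMPTION: placeholder weights for illustration, not the framework's published weights.
placeholder_weights = {"earth": 0.30, "physical": 0.30, "life": 0.40}


def composite(means: dict, weights: dict) -> float:
    """Weighted average of the field means, normalised by the sum of the weights."""
    total_weight = sum(weights.values())
    return sum(means[field] * weights[field] for field in means) / total_weight


composite_mean = composite(field_means, placeholder_weights)
print(round(composite_mean, 1))
```

With these placeholder weights the result rounds to the reported composite average of 162, but any weights that sum to one would give a value between 161 and 163 for these field averages.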

The use of separate grade-specific reporting scales for the science assessment is 
consistent with the National Assessment Governing Board’s 1993 policy that future 
NAEP assessments be developed using within-grade frameworks and that scaling be 
carried out within grade. Because this science assessment was based on a new 
framework, and no comparisons with previous NAEP science assessments were possible, 
a new scale was developed. The ranges of the science scales (from 0 to 300) differ by 
design from the 0-to-500 reporting scales used in other NAEP subject areas and were 
chosen to minimize confusion with other common test scales and to discourage 
inappropriate cross-grade comparisons. 

The national average on the science scale is 150, including both public and 
nonpublic school students. The average for the nation’s public school students appears 
most frequently in this report, and it is slightly lower. (Additional details of the scaling 
procedures can be found in Appendix C of this report, in the NAEP 1996 Technical 
Report, and in the Technical Report of the NAEP 1996 State Assessment Program in 
Science.) 

Science Achievement Levels 

A companion report, being issued by the National Assessment Governing Board, 
will present the NAEP 1996 science results in terms of achievement levels. As 
authorized by the NAEP legislation and adopted by the National Assessment Governing 
Board, the achievement levels are based on the Board’s judgments about what are 
reasonable performance expectations for students on the NAEP 1996 science assessment. 
The achievement levels for the NAEP 1996 science assessment were adopted on an 
interim basis, indicating that they may be revised when other information becomes 
available, such as the fourth- and twelfth-grade results from the Third International 
Mathematics and Science Study (TIMSS). 

Interpreting NAEP Results 

This report describes science performance for eighth graders and compares the 
results for various groups of students within that population — for example, those who 
have certain demographic characteristics or who responded to a specific background 
question in a particular way. The report examines the results for individual demographic 
groups and for individual background questions. It does not include an analysis of the 
relationships among combinations of these subpopulations or background questions. 

Because the percentages of students in these subpopulations and their average 
science scale scores are based on samples, rather than on the entire population of eighth 
graders in a jurisdiction, the numbers reported are necessarily estimates. As such, they 
are subject to a measure of uncertainty, reflected in the standard error of the estimate. 
When the percentages or average scale scores of certain groups are compared, it is 
essential to take the standard error into account, rather than to rely solely on observed 
similarities or differences. Therefore, the comparisons discussed in this report are based 
on statistical tests that consider both the magnitude of the difference between the means 
or percentages and the standard errors of those statistics. 
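A comparison of this kind can be sketched as follows. This is a simplified illustration, assuming the two samples are independent so that the standard error of the difference is the square root of the sum of the squared standard errors; the function name and the 1.96 critical value are assumptions for the example, and Appendix A remains the authority on the procedure actually used. The inputs are the Montana and national public school averages and standard errors reported in this report.

```python
import math


def difference_test(mean_a, se_a, mean_b, se_b, critical=1.96):
    """Return (difference, standard error of the difference, significant?)."""
    diff = mean_a - mean_b
    # ASSUMPTION: independent samples, so the variances of the two estimates add.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se_diff, abs(diff) / se_diff > critical


# Montana: 162 (1.2); nation: 148 (0.9).
diff, se_diff, significant = difference_test(162, 1.2, 148, 0.9)
print(diff, round(se_diff, 2), significant)
```

Here the 14-point difference is many times larger than its standard error of about 1.5, so the sketch agrees with the report's statement that Montana's average was higher than the nation's.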

The statistical tests determine whether the evidence, based on the data from the 
groups in the sample, is strong enough to conclude that the averages or percentages are 
really different for those groups in the population. If the evidence is strong (i.e., the 
difference is statistically significant), the report describes the group averages or 
percentages as being different (e.g., one group performed higher than or lower than 
another group) — regardless of whether the sample averages or sample percentages 
appear to be about the same or not. If the evidence is not sufficiently strong (i.e., the 
difference is not significant), the averages or percentages are described as being not 
significantly different, again regardless of whether the sample averages or sample 
percentages appear to be about the same or widely discrepant. Rather than relying on 
the apparent magnitude of the difference between sample averages or percentages, the 
reader is cautioned to rely on the results of the statistical tests to determine whether those 
sample differences are likely to represent actual differences between the groups in the 
population. The statistical tests and the Bonferroni procedure, which is used when more 
than two groups are being compared, are discussed in greater detail in Appendix A. 

In addition, some of the percentages reported in the text of the report are given 
qualitative descriptions (e.g., relatively few, about half, etc.). The descriptive phrases 
used and the rules used to select them are also described in Appendix A. 
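The Bonferroni procedure mentioned above for comparisons among more than two groups can be sketched in its textbook form. This is an assumption-laden illustration rather than the exact Appendix A procedure: it divides the 5 percent significance level evenly across the comparisons in a family and uses a two-sided normal critical value. The function name is hypothetical.

```python
from statistics import NormalDist


def bonferroni_critical_value(num_comparisons: int, alpha: float = 0.05) -> float:
    """Two-sided normal critical value after splitting alpha across the comparisons."""
    adjusted_alpha = alpha / num_comparisons
    return NormalDist().inv_cdf(1 - adjusted_alpha / 2)


# One comparison gives the familiar value near 1.96; more comparisons raise the bar.
for m in (1, 3, 10):
    print(m, round(bonferroni_critical_value(m), 3))
```

The practical effect is that a difference between two groups must be larger, relative to its standard error, to be reported as significant when it is one of several comparisons.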

The tables in the Highlights and in Part 1 (Chapters 1 and 2) show not only the 
average scale scores for students but also the distribution of their scores at five selected 
percentiles. The distribution of the scores through these percentiles encourages the 
reader to consider the performance of the students in the various groupings (whether by 
state, region, gender, participation in federal programs, etc.) as overlapping ranges of 
heterogeneous performance, rather than as a simple monolithic average. As an example, 
consider Table 2.5 which shows that, for the nation, the 75th percentile for students 
eligible for free or reduced-price lunch is 157 while the average scale score for students 
who were not eligible for this service is 155. This means that at least 25 percent of the 
students eligible for free or reduced-price lunch performed above the average for 
students who were not eligible. 

How is This Report Organized? 

The NAEP 1996 Science State Report for Montana is a computer-generated report 
that describes the science performance of eighth-grade students in Montana, the West 
region, and the nation. The system to generate the state reports was developed because 
reports customized with each jurisdiction’s data would otherwise have been impossible 
to produce in a timely fashion. Because the process is automated, the variables reported 
were chosen as those most likely to be of interest to most jurisdictions. Unfortunately, 
this means that some variables of particular interest may not be reported here; however, 
each jurisdiction will receive all reportable data on CD ROM, and all data will be 
available on the NCES Web site (http://www.ed.gov/NCES/naep). Also because of the 
process, the language in the bullets and in parts of the text sometimes seems awkward. 
It is hoped that understanding the reason for this awkwardness will enable the reader 
to overlook it. 

A separate report describes additional eighth-grade science assessment results for 
the nation and the states, as well as the national results for grades 4 and 12.¹² This State 
Report consists of four sections: 

• This Introduction provides background information about what was 
assessed, who was sampled, and how the results are reported. 

• Part One shows the distribution of science scale score results for 
eighth-grade students in Montana, the West region, and the nation. 

• Part Two relates eighth-grade public school students’ science scale 
scores to contextual information about school characteristics, instruction, 
and home support for science in Montana, the West region, and the 
nation. In addition, Chapter 5 discusses student results of the hands-on 
tasks. 

• Several Appendices are presented to support the results discussed in the 
report: 

    Appendix A    Reporting NAEP 1996 Science Results 
    Appendix B    The NAEP 1996 Science Assessment 
    Appendix C    Technical Appendix 
    Appendix D    Teacher Preparation 

¹² O’Sullivan, C.Y., C.M. Reese, and J. Mazzeo. NAEP 1996 Science Report Card for the Nation and the States. 
(Washington, DC: National Center for Education Statistics, 1997). 

Other Reports of NAEP 1996 Science Results 

Related reports may be of interest to the reader: 

• Cross-State Data Compendium for the 1996 Grade 8 Science Assessment 

• Technical Report of the NAEP 1996 State Assessment Program in 
Science 

• NAEP 1996 Science Report Card for the Nation and the States 

As presently planned, there will be three additional reports appearing in late 1997 
and early 1998. One report will contain sample items and examples of student work 
on these questions. A second report will cover policy and practices in the schools and 
classrooms in the United States. A third report will cover special components of the 
NAEP science assessment, including the advanced science assessment and the hands-on 
exercises. 

PART ONE 

Science Scale Score Results 

The following chapters describe the average science scale scores of eighth-grade 
students in Montana. As described in the Introduction, the NAEP science scale is a 
composite of the three major fields of science: earth, physical, and life. Student 
performance is generally reported on this composite scale and so reflects average student 
scores across the three fields. Student performance may also be summarized on separate 
NAEP fields of science scales that range from 0 to 300. 

This part of the report contains two chapters. Chapter 1 compares the overall 
science performance of public school students in Montana to the nation. (Results for 
the West region are also presented.) It also contains a U.S. map comparing the average 
scale scores in Montana with other states, and a table showing students’ scale score 
distributions for the three fields of science. Chapter 2 summarizes science performance 
for subpopulations of public school students as defined by gender, race/ethnicity, 
parental education, participation in Title I services and programs, and eligibility for the 
free/reduced-price lunch component of the National School Lunch Program (NSLP). 
The second chapter also provides the scale score distributions for nonpublic school 
students, as well as the combined results for students in public and nonpublic schools. 

The NAEP 1996 assessment in science is the first developed using a new 
framework, described in Appendix B. The scale developed to report results from the 
1996 science assessment is a within-grade scale comprised of three fields of science 
scales. Appendix A describes reporting on the scale, and Appendix C describes the 
construction of the scale. 

Item Maps 

Students’ performance is summarized on the NAEP science scale which ranges 
from 0 to 300. Nationally, public school students’ scale scores ranged from about 102 
for those scoring at the 10th percentile to about 191 for those performing at the 90th 
percentile. Sample questions are shown in Figure 1.1 illustrating the range of 
performance on the NAEP science scale for grade 8. Each question is one that is likely 
to be answered correctly by a student whose score is at or near the given percentile. 

To illustrate the range of performance in more detail, questions from the 
assessment were “mapped” onto a 0 to 300 scale, as in Figure 1.2. The item map is a 
visual representation of the scale showing selected questions in positions corresponding 
to their difficulty. The item map shows which questions a student of any particular 
ability is likely to answer correctly. The position of the question on the scale represents 
a dividing line. Students who attained scores greater than the score corresponding to the 
question’s difficulty are likely to answer it correctly, while students with scores below 
that degree of difficulty are less likely to answer it correctly. 

More specifically, students who scored below the scale score associated with a 
particular question had less than a 65 percent probability of earning a given amount of 
credit on a constructed-response question or less than a 74 percent probability of 
correctly answering a multiple-choice question. A small proportion of these students 
— those near but below the question’s position on the scale — may be more likely than 
not to answer the question correctly (between 50 and 65 or 74 percent). Such students 
are not considered “able” to answer the question, since they have not achieved sufficient 
consistency in their responses. 
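One way to see how the two thresholds may relate is to adjust the 65 percent criterion for guessing. The sketch below is an interpretation, not a formula stated in this report: it assumes that a student who does not know the answer to a 4-option multiple-choice question guesses correctly one time in four, and the function name is hypothetical.

```python
def guessing_adjusted_criterion(base_criterion: float, num_options: int) -> float:
    """Raise a criterion to allow for correct guesses among equally likely options."""
    chance = 1.0 / num_options  # ASSUMPTION: blind guessing among equally likely options
    return chance + (1.0 - chance) * base_criterion


mc_criterion = guessing_adjusted_criterion(0.65, 4)
print(f"{mc_criterion:.4f}")
```

Under that assumption the adjusted value is 0.7375, which rounds to the 74 percent criterion used for multiple-choice questions.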

This discussion and the item map illustrations refer to eighth-grade students in the 
national assessment, whose scores may not resemble those of eighth-grade students in 
Montana. 






FIGURE 1.1 

Sample Questions Likely to Be Answered Correctly by 
Grade 8 Students At or Near Selected Percentiles 

Percentile    Question 

10th          Find typical yearly rainfall from a graph. (104) 
25th          Explain the impact of fish death on an ecosystem. (127) 
50th          Identify the effect of acid rain. (150) 
75th          Understand where earthquakes occur. (172) 
90th          Explain why lightning is seen before thunder is heard. (194) 



The value in parentheses represents the scale score attained by students who had a 65 percent probability of reaching a 
given level on a constructed-response question (in italic type) or a 74 percent probability of correctly answering a 4-option 
multiple-choice question (in regular type). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 

Figure 1.2 is an item map for grade 8.¹³ Multiple-choice questions are shown in 
regular type; constructed-response questions are in italic type.¹⁴ An example of how to 
interpret the item map may be helpful. In this figure, a multiple-choice question 
involving interpreting a graph maps at the 136 point on the scale. This means that 
eighth-grade students with science scale scores at or above 136 are likely to answer this 
question correctly; that is, they have at least a 74 percent chance of doing so.¹⁵ Put 
slightly differently, this question is answered correctly by at least 74 of every 100 
students scoring at or above the 136 scale-score level. Note that this does not mean that 
students at or above the 136 scale score always answer the question correctly or that 
students below the 136 scale score always answer it incorrectly. 

As another example, consider the constructed-response question that maps at a 
scale score of 194. This question concerns the differing speeds of light and sound. 
Scoring of responses to this question allows for partial credit by using a three-level 
scoring guide. Mapping a question at the 194 scale score indicates that at least 65 
percent of the students performing at or above this point achieved a score of 3 
(“Complete”) on the question. Among students with lower scores, less than 65 percent 
gave complete responses to the question. 



¹³ Details on the procedures used to develop the item map are provided in the forthcoming NAEP 1996 Technical 
Report. The procedures are similar to those used in past NAEP assessments. 

¹⁴ The placement of constructed-response questions is based on (1) the “mapping” of a score of 3 on a 3-point scoring 
guide for short constructed-response questions and (2) the “mapping” of a score of at least 3 on a 4-point scoring guide 
and a score of at least 4 on a 5-point scoring guide for extended constructed-response questions. 

¹⁵ For constructed-response questions, a criterion of 65 percent was used. For multiple-choice questions, the criterion 
was 74 percent. The use of a higher criterion for multiple-choice questions reflected the students’ ability to “guess” 
the correct answer from among the alternatives. 

FIGURE 1.2 

Map of Selected Questions on the NAEP Science Scale for Grade 8 

Questions appear in their order on the figure, from the top of the NAEP scale (300) downward, 
with scale scores in parentheses. 

    Explain cause and prevention of crumbling of ancient monument (213) 
    Recognize part of cell that contains genetic material (205) 
    [illegible] 
    Understand forms of energy conversion (189) 

  192 (90th percentile) 

    Know which statement is consistent with theory of evolution (206) 
    Explain why lightning is seen before thunder is heard (194) 
    Understand trend of rainfall data on graph (184) 
    Identify areas that have warm summers and cold winters (180) 
    Understand markings of contour map to find direction of river flow (184) 
    Understand which setup models the water cycle (182) 
    Understand where earthquakes occur (172) 
    Know how pitch is related to length (171) 
    [illegible] 
    Design experiment to investigate [illegible] 

  (75th percentile) 

    Understand what happens when a magnet is placed inside a coil (172) 
    Understand movement of truck in relation to an [illegible] 
    Understand direction of movement after collision (163) 
    Identify effect of acid rain (150) 

  153 (50th percentile) 

    Identify source of atmospheric oxygen (158) 
    Classify organism from characteristics (153) 
    Draw [illegible] model of solar system (129) 
    Identify property of water that is most important for organisms (148) 
    Interpret graph showing seed production and rainfall (136) 
    Understand effect on density of adding more salt to solution (135) 
    Explain advantages/disadvantages of planting near a stream (124) 
    Interpret graph of revolution Venus distance (121) 

  (25th percentile) 

    Explain impact of fish death on ecosystem (127) 
    Identify best experimental setup (121) 
    Identify organs important for oxygen transfer (113) 
    Identify property that results from processes of living things (114) 
    Identify organisms that live in tropical rain forest (113) 
    Identify organism that produces its own food (89) 

  (10th percentile) 

    Find typical yearly rainfall from graph (104) 
    Determine whether markers are permanent or non-permanent ([illegible]) 
    [illegible] 

NOTE: Position of questions is approximate and an appropriate scale range is displayed for grade 8. 
Italic type indicates a constructed-response question. Regular type denotes a multiple-choice question. 

Each grade 8 science question was mapped onto the NAEP 0-to-300 science scale. The position of the question on the scale 
represents the scale score attained by students who had a 65 percent probability of reaching a given score level on a constructed- 
response question or a 74 percent probability of correctly answering a 4-option multiple-choice question. Only selected questions are 
presented. Percentiles of scale score distribution are referenced on the map. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment. 

CHAPTER 1 



Science Scale Score Results for Eighth-Grade 
Students 

To remain competitive in the global economy, the nation requires a technologically and 
scientifically literate citizenry. As a result, reform in science and mathematics education 
in the United States has gained increasing attention. The 1983 publication A Nation At 
Risk: The Imperative for Educational Reform¹⁶ called for overall reform of the United 
States educational system, with heavy emphasis placed on mathematics and science. 
The National Goals Panel was convened in 1989 to further focus attention on education 
reform. In 1991 the National Science Foundation’s Statewide Systemic Initiative began 
awarding grants to support state reform in K-12 mathematics and science 
instruction.¹⁷ During the 1990s many states have been developing standards for science 
curriculum, teaching, and assessment using guidance from reform efforts such as the 
American Association for the Advancement of Science’s Project 2061, the National 
Science Teachers Association’s Scope, Sequence, and Coordination of High School 
Science, and the recently published National Research Council’s National Science 
Education Standards.¹⁸ A reaffirmation of the goal for world-class standards in 
education was made at the 1996 Governors’ Summit in Palisades, NJ. All these efforts 
address ways to produce innovative science curricula aimed at improving national 
scientific literacy. As a means of informing the progress of such reform, the U.S. 
Department of Education supports programs geared toward assessing the current level 
of science knowledge and skills including the Third International Mathematics and 
Science Study (TIMSS),¹⁹ administered in 1995, and the 1996 National Assessment of 
Educational Progress (NAEP) in science. 



¹⁶ A Nation at Risk: The Imperative for Educational Reform. (Washington, DC: National Commission on Excellence in 
Education, 1983). 

¹⁷ Statewide Systemic Initiative. (Washington, DC: National Science Foundation, 1990). 

¹⁸ Science for All Americans: A Project 2061 Report on Literacy Goals in Science, Mathematics and Technology. 
(Washington, DC: American Association for the Advancement of Science, 1989); Scope, Sequence, and Coordination 
of High School Science. (Washington, DC: National Science Teachers Association, 1995); National Science Education 
Standards. (Washington DC: National Research Council, 1996). 

¹⁹ The Third International Mathematics and Science Study was conducted in 1994 in the Southern Hemisphere and in 
1995 in the Northern Hemisphere. 

The NAEP 1996 state science assessment at grade 8 marked the first time science was 
assessed at the state level. It continues the state-level component begun in 1990 
with the NAEP Trial State Assessment (TSA). The NAEP 1996 assessment in science 
had 47 participating jurisdictions.²⁰ Results for 46 jurisdictions were reported for the 
science assessment.²¹ 

The science framework for the 1996 National Assessment of Educational 
Progress²² was developed through a consensus process involving educators, policy 
makers, business people, assessment experts and curriculum specialists. The 1996 
NAEP science assessment included multiple-choice questions, constructed-response 
exercises, and (for the first time) hands-on tasks. Because the 1996 assessment was 
based on an essentially new framework, it is not possible to compare results from the 
1996 assessment with those from the previous NAEP science assessment in 1990. 

Table 1.1 shows the distribution of science scale scores for eighth-grade students 
attending public schools in Montana, the West region, and the nation. 

• The average science scale score for eighth-grade public school students 
in Montana was 162. This average was higher than that for public 
school students across the nation (148).²³ 

TABLE 1.1 

Distribution of Science Scale Scores for Public School 
Students 

             Average       10th         25th         50th         75th         90th 
             Scale Score   Percentile   Percentile   Percentile   Percentile   Percentile 

Montana      162 (1.2)     127 (2.6)    146 (1.7)    164 (1.2)    180 (0.6)    194 (1.9) 
West         148 (2.2)     101 (3.3)    127 (3.1)    151 (2.0)    172 (1.7)    190 (3.7) 
Nation       148 (0.9)     102 (1.6)    126 (1.3)    151 (0.9)    172 (1.1)    191 (1.3) 


The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
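The "within ± 2 standard errors" rule in the table note can be applied as in the short sketch below. It is a direct illustration of that rule of thumb using the Montana average from the table; the function name is hypothetical, and comparisons between two estimates still require the standard error of the difference, as the note states.

```python
def approx_95_interval(estimate: float, standard_error: float) -> tuple:
    """Approximate 95 percent interval: the estimate plus or minus 2 standard errors."""
    margin = 2 * standard_error
    return estimate - margin, estimate + margin


# Montana public school average: 162 with a standard error of 1.2.
low, high = approx_95_interval(162, 1.2)
print(round(low, 1), round(high, 1))
```

Read this way, the population average for Montana public school eighth graders is estimated to lie between roughly 159.6 and 164.4.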



²⁰ Jurisdiction refers to states, territories, the District of Columbia, and the Department of Defense Education Activities 
(DoDEA) domestic and international schools. The DoDEA schools also made special arrangements to assess their 
fourth-grade students in science. 

²¹ One jurisdiction did not meet minimum participation levels for public or nonpublic schools and did not have any results 
reported. 

²² Science Framework for the 1996 National Assessment of Educational Progress. (Washington, DC: National Assessment 
Governing Board, 1993). 

²³ Differences reported as significant are statistically different at the 95 percent confidence level. This means that with 
95 percent confidence there is a real difference in the average science scale score between the two populations of 
interest. 

Comparisons Between Montana and Other Participating 
Jurisdictions 



The map on the following page shows how the average science scale score for 
eighth-grade public school students in Montana compares with those of other 
jurisdictions participating in the NAEP 1996 science assessment. The different shadings 
on the map indicate whether or not the average scale scores of public school students 
in the other jurisdictions were statistically different from that of public school students 
in Montana (“Target State”). States with horizontal lines have a significantly lower 
average science scale score than Montana while states with gray shading have a 
significantly higher average scale score. Unshaded states have average scale scores that 
did not differ significantly from the average for Montana. States with large 
crosshatching did not meet minimum participation rate guidelines established by NCES 
for the NAEP assessments. A description of the statistical procedures used to produce 
this map is contained in Appendix A. 

Performance in the NAEP Fields of Science 

The core of the science framework is organized along two dimensions. The first 
divides science into three major fields: earth, physical, and life. The second dimension 
defines characteristic elements of knowing and doing science: conceptual understanding, 
scientific investigation, and practical reasoning. Each question is categorized as 
measuring one of the elements of knowing and doing within one of the fields of science. 

Table 1.2 shows the distribution of scale scores for each of the three fields of 
science for Montana, the West region, and the nation. Appendix B describes the three 
fields of science in more detail, and Appendix C contains a discussion of the scaling 
procedures used to develop the three fields of science scales and the composite NAEP 
science scale. 

• Students in Montana performed higher than students nationwide in the 
physical science, earth science, and life science fields described in the 
science framework. 

TABLE 1.2 

Distribution of Science Scale Scores for Public School 
Students by Fields of Science 

                   Average       10th         25th         50th         75th         90th 
                   Scale Score   Percentile   Percentile   Percentile   Percentile   Percentile 

Physical Science 
  Montana          163 (1.2)     126 (2.0)    145 (1.4)    165 (1.4)    183 (1.5)    198 (1.5) 
  West             146 (2.1)     101 (4.2)    126 (2.6)    151 (2.2)    172 (1.7)    191 (3.3) 
  Nation           149 (1.0)     101 (2.0)    126 (1.3)    151 (1.2)    173 (1.2)    192 (1.6) 

Earth Science 
  Montana          162 (1.3)     125 (3.3)    145 (1.4)    164 (1.4)    182 (1.8)    196 (1.2) 
  West             149 (2.1)     102 (4.1)    127 (3.4)    152 (1.3)    173 (2.2)    193 (2.8) 
  Nation           149 (1.0)     101 (1.9)    126 (1.5)    150 (1.2)    173 (1.3)    192 (1.9) 

Life Science 
  Montana          161 (1.5)     125 (2.9)    145 (3.1)    163 (1.4)    179 (1.7)    193 (1.5) 
  West             147 (2.6)      98 (5.3)    125 (2.7)    151 (2.8)    172 (2.8)    191 (5.3) 
  Nation           148 (1.1)     100 (2.2)    126 (1.3)    151 (1.0)    173 (1.1)    191 (1.7) 


The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
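
The ± 2 standard error rule stated in the table notes can be applied to any entry in these tables. The following Python sketch is offered only as an illustration: the function name `approx_95_interval` is hypothetical, and the multiplier of 2 is taken from the table note rather than from the procedures in Appendix A.

```python
def approx_95_interval(estimate, standard_error, multiplier=2.0):
    """Return (low, high) bounds: estimate +/- multiplier * standard error.

    The multiplier of 2 follows the table note ("within +/- 2 standard
    errors"); it is an approximation, not an exact 95 percent bound.
    """
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


# Montana physical science average from Table 1.2: 163 (1.2)
low, high = approx_95_interval(163, 1.2)
print(f"{low:.1f} to {high:.1f}")
```

For the Montana physical science average of 163 with a standard error of 1.2, this gives an interval of roughly 160.6 to 165.4.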

Science Scale Score Results for Eighth-Grade 
Students by Subpopulations 

The previous chapter provided a view of the overall science performance of 
eighth-grade students in Montana and the nation. It is also important to examine the 
average performance of subgroups since past NAEP assessments in science, as well as 
in other academic subjects, have shown substantial differences among groups defined 
by gender, racial/ethnic background, parental education, and other demographic 
characteristics.^ A key contribution of NAEP to the ongoing conversations concerning 
education reform is the ability to monitor the performance of subgroups of students in 
academic achievement. 

The NAEP 1996 state assessment in science provides performance information for 
subgroups of eighth graders in Montana, the West region, and the nation. In addition 
to the more typical demographic subgroups defined by gender, race/ethnicity, and 
parental education, the 1996 assessment also collected information on two federally 
funded programs — student participation in Title I programs and services, and student 
eligibility for the free/reduced-price school lunch program. 

The NAEP 1996 state assessment in science also continues a component first 
introduced with the NAEP 1994 state assessment in reading — assessment of a 
representative sample of nonpublic school students. The 1996 state assessment marks 
the first time that NAEP science results for public and nonpublic school students can 
be presented and compared at the state level. The comparison of public and nonpublic 
school students’ performance does not account for confounding factors such as student 
composition, family socioeconomic status, and parental involvement in their child’s 
education. The size of the NAEP nonpublic school sample in most jurisdictions does 
not allow for such in-depth analyses, and a more complete picture of public and 
nonpublic school comparisons may be achieved by supplementing NAEP results with 
data from other sources, such as the Schools and Staffing Survey (SASS)^ or the 
National Education Longitudinal Study (NELS).^ 

^ Jones, L.R., I.V.S. Mullis, S.A. Raizen, I.R. Weiss, and E.A. Weston. The 1990 Science Report Card: NAEP’s 
Assessment of Fourth, Eighth, and Twelfth Graders. (Washington, DC: National Center for Education Statistics, 1992); 
Campbell, J.R., C.M. Reese, C. O’Sullivan, and J.A. Dossey. NAEP 1994 Trends in Academic Progress. (Washington, 
DC: National Center for Education Statistics, 1996). 

^ U.S. Department of Education. Schools and Staffing in the United States: A Statistical Profile, 1993-94. (Washington, 
DC: National Center for Education Statistics, 1996). URL: http://www.ed.gov/NCES/surveys/sass.html. 

^ National Education Longitudinal Study. National Education Longitudinal Study of 1988: Base Year Student Survey. 
(Washington, DC: National Center for Education Statistics, 1995). URL: http://www.ed.gov/NCES/surveys/nels88.html. 
Also see a report based on NELS:88 findings, NCES 97-838: Science Proficiency and Course Taking in High School. 
This is downloadable from URL: http://www.ed.gov/NCES/pubs97/97838.html. 

A description of the subgroups and how they are defined is presented in 
Appendix A. The reader is cautioned against making simple or causal inferences related 
to the performance of various subgroups of students or about the effectiveness of public 
and nonpublic schools or Title I programs. Average performance differences between 
two groups of students may in part be due to socioeconomic or other factors. For 
example, differences observed among racial/ethnic subgroups are almost certainly 
associated with a broad range of socioeconomic and educational factors not discussed 
in this report and possibly not addressed by the NAEP assessment program. Similarly, 
differences in performance between students eligible for Title I programs and those not 
eligible do not account for the initial performance level of the students prior to 
placement in Title I programs or differences in course content and emphasis between the 
two groups. 

Gender 

Previous NAEP results for science have shown a significant difference in the 
average scale scores of male and female eighth graders, with males having consistently 
higher scale scores.^ As shown in Table 2.1, the NAEP 1996 state science assessment 
results for eighth graders in Montana are not consistent with those general findings. 



• The average science scale score of males did not differ significantly 
from that of females in either Montana or the nation. 

TABLE 2.1 

Distribution of Science Scale Scores for Public School 
Students by Gender 

State Assessment 

                        Average        10th           25th           50th           75th           90th
                        Scale Score    Percentile     Percentile     Percentile     Percentile     Percentile

Male
  Montana               164 (1.7)      129 (3.5)      148 (3.8)      166 (1.6)      182 (2.0)      197 (1.4)
  West                  147 (2.2)      98 (3.1)       125 (3.8)      152 (2.2)      172 (1.8)      191 (2.4)
  Nation                149 (1.1)      101 (1.8)      126 (2.0)      153 (1.1)      174 (1.2)      192 (1.2)

Female
  Montana               160 (1.3)      125 (3.3)      144 (1.9)      162 (1.3)      178 (0.8)      191 (0.8)
  West                  149 (2.6)      104 (5.2)      129 (3.4)      151 (2.4)      171 (2.8)      190 (4.5)
  Nation                148 (1^)       103 (1.3)      127 (1.4)      150 (1.3)      170 (1.7)      189 (3.4)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 



^ Campbell, J.R., K.E. Voelkl, and P.L. Donahue. NAEP 1996 Trends in Academic Progress. (Washington, DC: National 
Center for Education Statistics, 1997); Jones, L.R., I.V.S. Mullis, S.A. Raizen, I.R. Weiss, and E.A. Weston. The 1990 
Science Report Card: NAEP's Assessment of Fourth, Eighth, and Twelfth Graders. (Washington, DC: National Center 
for Education Statistics, 1992). 

Race/Ethnicity 

As part of the background questions administered with the NAEP 1996 science 
assessment, students were asked to identify the racial/ethnic subgroup that best describes 
them. The five mutually exclusive categories were White, Black, Hispanic, Asian or 
Pacific Islander, and American Indian or Alaskan Native. 

Findings from previous NAEP science assessments have shown that racial/ethnic 
differences exist in science performance.^ However, when interpreting differences in 
subgroup performance, confounding factors related to socioeconomic status, home 
environment, and educational opportunities available to students need to be 
considered.^ The distribution of eighth-grade science scale scores for Montana, the 
West region, and the nation by race/ethnicity is shown in Table 2.2.^ 



• White students in Montana demonstrated an average science scale score 
that was higher than those of Hispanic and American Indian students. 



TABLE 2.2 

Distribution of Science Scale Scores for Public School 
Students by Race/Ethnicity 

                        Average        10th           25th           50th           75th           90th
                        Scale Score    Percentile     Percentile     Percentile     Percentile     Percentile

White
  Montana               166 (0.9)      133 (2.3)      150 (0.7)      167 (1.2)      182 (0.9)      196 (1.1)
  West                  158 (2.0)      118 (2.7)      140 (2.2)      159 (2.7)      178 (2.1)      196 (4.9)
  Nation                159 (1.1)      120 (1.3)      140 (1.2)      160 (1.2)      179 (1.2)      196 (1.8)

Hispanic
  Montana               147 (2.7)      114 (9.8)      129 (13.0)     147 (5.0)      164 (11.0)     179 (6.4)
  West                  127 (2.6)      79 (3.8)       102 (2.5)      128 (3.5)      152 (4.8)      170 (4.5)
  Nation                127 (1.8)      83 (3.3)       104 (2.6)      129 (1.6)      152 (2.7)      170 (2.8)

American Indian
  Montana               139 (2.7)      103 (6.7)      121 (3.9)      140 (1.7)      157 (1.9)      173 (3.8)
  West                  152 (5.0)!     110 (17.0)!    132 (12.3)!    153 (4.8)!     173 (2.9)!     194 (5.1)!
  Nation                148 (4.2)      106 (8.7)      125 (9.5)      151 (5.5)      168 (5.0)      186 (8.5)


The NAEP science scale ranges from 0 to 300. Results are reported for racial/ethnic subgroups meeting established 
sample size requirements (see Appendix A). The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 

Campbell, J.R., K.E. Voelkl, and P.L. Donahue. NAEP 1996 Trends in Academic Progress. (Washington, DC: National 
Center for Education Statistics, 1997); Jones, L.R., I.V.S. Mullis, S.A. Raizen, I.R. Weiss, and E.A. Weston. The 1990 
Science Report Card: NAEP's Assessment of Fourth, Eighth, and Twelfth Graders. (Washington, DC: National Center 
for Education Statistics, 1992). 

^ McKenzie, F.D. “Educational Strategies for the 1990s.” The State of Black America 1991. (New York: National Urban 
League, 1991). 

Results are reported for racial/ethnic subgroups meeting established sample size requirements (see Appendix A). 

Students’ Reports of Parents’ Highest Education Level 

Students were asked to indicate the highest level of education completed by each 
parent. Four levels of education were identified: did not finish high school, graduated 
from high school, some education after high school, and graduated from college. A 
choice of “I don’t know” was also available. For this analysis, the highest education 
level reported for either parent was used. 
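
The rule that the highest level reported for either parent is used can be sketched in a few lines of Python. This is an illustration rather than NAEP's actual coding procedure: the function name is hypothetical, and classifying a student as “I don’t know” only when neither parent's level is known is an assumption based on the description above.

```python
# The four education levels, ordered from lowest to highest.
LEVELS = [
    "did not finish high school",
    "graduated from high school",
    "some education after high school",
    "graduated from college",
]
UNKNOWN = "I don't know"


def highest_parent_education(parent_a, parent_b):
    """Return the higher of the two reported levels.

    Assumption: a response outside LEVELS (such as UNKNOWN) is ignored
    when the other parent's level is known.
    """
    known = [level for level in (parent_a, parent_b) if level in LEVELS]
    if not known:
        return UNKNOWN
    return max(known, key=LEVELS.index)


print(highest_parent_education("graduated from high school",
                               "graduated from college"))
```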

In general, results show that increasing parental education is associated with 
increases in student performance. In reviewing these results, it is important to note that, 
nationally, approximately 10 percent of eighth graders did not know the level of 
education that either of their parents had completed. For public school students in 
Montana, this percentage was 6 percent. Despite the fact that some research has 
questioned the accuracy of student-reported data from similar groups of students,^’ past 
NAEP assessments in science, as well as other subject areas, have found that 
student-reported level of parental education exhibits a consistent positive relationship 
with student performance on the assessments.^^ Other research has corroborated NAEP 
findings.^^ 

Table 2.3 shows the results for eighth-grade public school students reporting that 
neither parent graduated from high school, at least one parent graduated from high 
school, at least one parent received some education after high school, at least one parent 
graduated from college, or that they did not know their parents’ highest education level. 
The following pertains to those students who reported knowing the educational level of 
one or both parents. 

• The average science scale score of students in Montana who reported 
that neither parent graduated from high school was lower than that of 
students who reported that at least one parent graduated from high 
school, at least one parent received some education after high school, 
or at least one parent graduated from college. 



Looker, E.D. “Accuracy of Proxy Reports of Parental Status Characteristics.” Sociology of Education, 62(4), pp. 
257-276, 1989. 

Jones, L.R., I.V.S. Mullis, S.A. Raizen, I.R. Weiss, and E.A. Weston. The 1990 Science Report Card: NAEP’s 
Assessment of Fourth, Eighth, and Twelfth Graders. (Washington, DC: National Center for Education Statistics, 1992); 
Campbell, J.R., K.E. Voelkl, and P.L. Donahue. NAEP 1996 Trends in Academic Progress. (Washington, DC: National 
Center for Education Statistics, 1997); Reese, C.M., K.E. Miller, J. Mazzeo, and J.A. Dossey. NAEP 1996 Mathematics 
Report Card. (Washington, DC: National Center for Education Statistics, 1997). 

National Education Longitudinal Study. National Education Longitudinal Study of 1988: Base Year Student Survey. 
(Washington, DC: National Center for Education Statistics, 1995). 

TABLE 2.3 

Distribution of Science Scale Scores by Public School 
Students’ Reports of Parents’ Highest Education Level 

State Assessment 

                        Average        10th           25th           50th           75th           90th
                        Scale Score    Percentile     Percentile     Percentile     Percentile     Percentile

Did not finish high school
  Montana               139 (3.1)      106 (20.6)     120 (7.4)      137 (4.0)      159 (3.5)      174 (3.2)
  West                  127 (3.6)      77 (8.5)       102 (1.5)      129 (4.6)      153 (5.8)      170 (6.4)
  Nation                131 (2.0)      86 (3.0)       108 (2.6)      134 (4.0)      153 (5.6)      170 (3.7)

Graduated from high school
  Montana               155 (2.2)      124 (3.8)      139 (4.7)      156 (2.3)      174 (3.7)      187 (2.7)
  West                  138 (2.1)      98 (3.7)       119 (4.7)      139 (2.2)      159 (1.6)      178 (2.1)
  Nation                140 (1.5)      98 (2.0)       119 (2.1)      142 (1.6)      163 (1.4)      181 (1.2)

Some education after HS
  Montana               164 (1.5)      135 (2.1)      151 (2.1)      165 (1.4)      179 (1.7)      193 (3.6)
  West                  156 (1.9)      116 (5.4)      139 (2.5)      159 (4.4)      175 (4.2)      189 (3.4)
  Nation                155 (1.2)      113 (1.0)      137 (1.5)      158 (2.8)      176 (2.2)      191 (1.4)

Graduated from college
  Montana               168 (1.3)      135 (2.8)      153 (1.4)      170 (1.4)      185 (1.4)      199 (3.5)
  West                  159 (2.7)      116 (6.4)      141 (2.0)      160 (3.4)      182 (4.9)      199 (4.2)
  Nation                157 (1.3)      112 (2.1)      137 (1.0)      160 (1.4)      180 (1.5)      198 (1.3)

I don’t know
  Montana               147 (3.6)      112 (23.0)     130 (6.4)      149 (3.1)      165 (6.6)      178 (2.4)
  West                  127 (2.9)      84 (2.2)       105 (4.9)      128 (4.8)      151 (7.1)      168 (4.9)
  Nation                133 (2.6)      88 (3.6)       109 (3.6)      134 (6.5)      157 (3.8)      174 (4.4)


The NAEP science scale ranges from 0 to 300. Results are reported for parental education subgroups meeting established 
sample size requirements (see Appendix A). The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 

Title I Participation 

The Improving America’s Schools Act of 1994 (P.L. 103-382) reauthorized the 
Elementary and Secondary Education Act of 1965 (ESEA). Title I Part A of the ESEA 
provides financial assistance to local educational agencies to meet the educational needs 
of children who are failing or most at risk of failing.^ Title I programs are designed 
to help disadvantaged students meet challenging academic performance standards. 
Through Title I, schools are assisted in improving teaching and learning and in providing 
students with opportunities to acquire the knowledge and skills outlined in their state’s 
content and performance standards. For high poverty Title I schools, all children in the 
school may benefit through participation in schoolwide programs. Title I funding 
supports state and local education reform efforts and promotes coordinating of resources 
to improve education for all students. 

NAEP first collected student-level information on participation in Title I programs 
in 1994. The NAEP program will continue to monitor the performance of Title I 
program participants in future assessments. The Title I information collected by NAEP 
refers to current participation in Title I services. Students who participated in such 
services in the past but do not currently receive services are not identified as Title I 
participants. Differences between students who receive Title I services and those who 
do not should not be viewed as an evaluation of Title I programs. Typically, Title I 
services are intended for students who score poorly on assessments. To properly 
evaluate Title I programs, the performance of students participating in such programs 
must be monitored over time and their progress must be assessed.^* 

Table 2.4 presents results for eighth-grade students by Title I participation. 

• For students receiving Title I services, the average science scale score 
of students in Montana (137) was not significantly different from* that 
of students nationwide (127). The average scale score of Montana 
students who were not receiving Title I services (164) was higher than 
that of their national counterparts (152). 

• The average scale score of Montana students who were receiving Title 
I services was lower than that of students who were not. 



* Although the difference may appear large, recall that “significance” here refers to “statistical significance.” 
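
To see why a 10-point difference can fail to reach statistical significance, the standard error of the difference can be computed from the values in Table 2.4. The Python sketch below is illustrative only. It assumes the two samples are independent and uses a simple threshold of 1.96 for a two-sided test at the .05 level; the function names are hypothetical, and the procedures actually used for this report are those described in Appendix A, which may include further adjustments.

```python
import math


def difference_z(est_a, se_a, est_b, se_b):
    """Return (z, se_diff) for the difference of two independent estimates.

    se_diff is the standard error of the difference, computed as the
    square root of the sum of the squared standard errors.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (est_a - est_b) / se_diff, se_diff


def is_significant(z, threshold=1.96):
    """Assumed decision rule: two-sided test at the .05 level."""
    return abs(z) > threshold


# Students receiving Title I services: Montana 137 (2.7) vs. nation 127 (4.9)
z_title1, se_title1 = difference_z(137, 2.7, 127, 4.9)
# Students not receiving Title I services: Montana 164 (1.3) vs. nation 152 (1.2)
z_other, se_other = difference_z(164, 1.3, 152, 1.2)

print(round(se_title1, 2), is_significant(z_title1))
print(round(se_other, 2), is_significant(z_other))
```

Under these assumptions, the large standard error for the national Title I group (4.9) makes the standard error of the difference about 5.6 points, so the 10-point gap is less than two standard errors wide, while the 12-point gap among nonparticipants is several standard errors wide.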


U.S. Department of Education, Office of Elementary and Secondary Compensatory Education Programs. Improving 
Basic Programs Operated by Local Education Agencies. (Washington, DC: U.S. Department of Education, 1996). 

For a study of mathematics performance of Title I students in 1991-1992, see U.S. Department of Education, 
PROSPECTS: The Congressionally Mandated Study of Educational Growth and Opportunity, Interim Report: Language 
Minority and Limited English Proficient Students. (Washington, DC: U.S. Department of Education, 1995). 

TABLE 2.4 

Distribution of Science Scale Scores for Public School 
Students by Title I Participation 

                        Average        10th           25th           50th           75th           90th
                        Scale Score    Percentile     Percentile     Percentile     Percentile     Percentile

Participating
  Montana               137 (2.7)      106 (10.8)     122 (2.4)      138 (4.9)      153 (3.4)      167 (5.8)
  West                  134 (6.6)!     86 (6.9)!      109 (8.5)!     135 (7.6)!     159 (8.5)!     179 (2.9)!
  Nation                127 (4.9)      82 (4.1)       102 (5.1)      126 (5.5)      152 (6.2)      170 (7.4)

Not participating
  Montana               164 (1.3)      131 (1.5)      149 (1.3)      166 (1.3)      181 (1.0)      195 (0.9)
  West                  151 (2.2)      105 (3.9)      131 (2.5)      154 (2.2)      174 (2.7)      192 (4.5)
  Nation                152 (1.2)      107 (1.8)      131 (1.6)      154 (1.3)      174 (1.3)      192 (2.3)


The NAEP science scale ranges from 0 to 300. Results are reported for students participating in Title I programs only 
if established sample size requirements are met (see Appendix A). The standard errors of the statistics appear in 
parentheses. It can be said with about 95 percent confidence that, for each population of interest, the value for the entire 
population is within ± 2 standard errors of the estimate for the sample. In comparing two estimates, one must use the 
standard error of the difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does 
not allow accurate determination of the variability of this statistic. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 



Free/Reduced-Price Lunch Program Eligibility 

The free/reduced-price lunch component of the National School Lunch Program 
(NSLP), offered through the U.S. Department of Agriculture (USDA), is designed to 
ensure that children near or below the poverty line receive nourishing meals.^* 
Eligibility for free or reduced-price meals is determined through the USDA’s Income 
Eligibility Guidelines; it is included in this report as an indicator of poverty. The 
program is available to public schools, nonprofit private schools, and residential child 
care institutions. 

NAEP first collected information on student-level eligibility for the federally 
funded NSLP in 1996. The NAEP program will continue to monitor the performance 
of these students in future assessments. 



U.S. General Services Administration. Catalog of Federal Domestic Assistance. (Washington, DC: Executive Office 
of the President, Office of Management and Budget, 1995). 

Table 2.5 shows the results for eighth graders based on their participation in this 
program. 



• For students who were eligible for free or reduced-price lunch, the 
average science scale score of students in Montana (150) was higher 
than that of students nationwide (133). Similarly, the average scale 
score of students who were not eligible for this service was higher for 
Montana (166) than for the nation (155). 

• The average scale score of Montana students who were eligible for free 
or reduced-price lunch was lower than that of students who were not. 

TABLE 2.5 

Distribution of Science Scale Scores for Public School 
Students by Free/Reduced-Price Lunch Eligibility 

                        Average        10th           25th           50th           75th           90th
                        Scale Score    Percentile     Percentile     Percentile     Percentile     Percentile

Eligible
  Montana               150 (2.0)      113 (4.0)      131 (3.1)      152 (1.8)      170 (1.8)      186 (2.8)
  West                  134 (3.2)      86 (4.4)       111 (2.7)      136 (4.4)      159 (2.6)      177 (3.1)
  Nation                133 (1.7)      87 (3.0)       108 (2.0)      133 (2.0)      157 (1.7)      176 (2.5)

Not eligible
  Montana               166 (1.2)      134 (2.5)      151 (0.9)      168 (1.6)      183 (1.2)      196 (1.1)
  West                  152 (1.7)      110 (4.9)      133 (3.2)      155 (2.1)      173 (2.4)      189 (2.3)
  Nation                155 (1.3)      114 (2.6)      136 (1.4)      157 (1.7)      176 (1.2)      194 (2.8)

Information not available
  Montana               165 (1.9)      135 (4.2)      151 (2.9)      166 (2.2)      180 (1.7)      194 (2.2)
  West                  154 (7.6)!     105 (13.5)!    133 (9.8)!     157 (7.3)!     180 (9.8)!     199 (9.7)!
  Nation                154 (3.6)!     109 (6.2)!     134 (5.0)!     157 (2.8)!     178 (2.9)!     196 (4.7)!


The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 

Type of School 

The NAEP 1996 state assessment marks the first time that nonpublic school 
students were assessed in science at the state level. Therefore, separate nonpublic school 
results can be reported for Montana. Also, results based on a combined sample of public 
and nonpublic school students are presented. 

In 1996, approximately 95 percent of eighth graders in Montana attended public 
schools, with the remaining students attending nonpublic schools (including Catholic and 
other private schools). For the nation, 89 percent of students at grade 8 attended public 
schools in 1996. 

Previous NAEP science assessments and other survey research on educational 
achievement have found significant differences in the performance of students attending 
public and nonpublic schools.^’ The reader is cautioned against using NAEP results to 
make simplistic inferences about the relative effectiveness of public and nonpublic 
schools. Average performance differences between the two types of schools may, in 
part, be related to socioeconomic and sociological factors, such as levels of parental 
involvement in their child’s education. To get a clearer picture of the differences 
between public and nonpublic schools, more in-depth investigations must be conducted 
that are beyond the scope of the NAEP state assessment program. 

Table 2.6 shows the distribution of science scale scores for the public, nonpublic, 
and combined eighth-grade populations in Montana, the West region, and the nation. 

• In Montana, the average scale score of public school students (162) was 
not significantly different from that of nonpublic school students (158). 

• The average science scale score of students attending nonpublic schools 
in Montana (158) was not significantly different from that of nonpublic 
school students across the nation (162). 

• The average science scale score of public and nonpublic school students 
combined in Montana was 162. This average was higher than that of 
students nationwide (150). 




O’Sullivan, C.Y., C.M. Reese, and J. Mazzeo. NAEP 1996 Science Report Card for the Nation and the States. 
(Washington, DC: National Center for Education Statistics, 1997); Campbell, J.R., K.E. Voelkl, and P.L. Donahue. 
NAEP 1996 Trends in Academic Progress. (Washington, DC: National Center for Education Statistics, 1997). 

TABLE 2.6 

Distribution of Science Scale Scores by Type of School 

State Assessment 

                        Average        10th           25th           50th           75th           90th
                        Scale Score    Percentile     Percentile     Percentile     Percentile     Percentile

Public
  Montana               162 (1.2)      127 (2.6)      146 (1.7)      164 (1.2)      180 (0.6)      194 (1.9)
  West                  148 (2.2)      101 (3.3)      127 (3.1)      151 (2.0)      172 (1.7)      190 (3.7)
  Nation                148 (0.9)      102 (1.6)      126 (1.3)      151 (0.9)      172 (1.1)      191 (1.3)

Nonpublic
  Montana               158 (8.6)!     107 (20.3)!    135 (12.0)!    161 (8.9)!     184 (4.8)!     197 (11.2)!
  West                  165 (6.0)!     128 (6.3)!     148 (8.4)!     167 (6.6)!     184 (7.3)!     198 (4.9)!
  Nation                162 (2E)       123 (8.1)      143 (3.1)      164 (3.0)      182 (2.8)      199 (2.1)

Combined
  Montana               162 (1.2)      126 (2.1)      145 (2.2)      164 (1.2)      180 (0.7)      194 (1.9)
  West                  149 (2.2)      102 (2.9)      128 (2.7)      153 (2.6)      173 (2.3)      191 (2.8)
  Nation                150 (0.9)      104 (1.0)      128 (1.1)      153 (0.8)      174 (1.4)      192 (1.4)


The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 

PART TWO 



Finding a Context for Understanding 
Students’ Science Performance in Public 
Schools 

The science performance of public school students in Montana can be better 
understood when viewed in the context of the environment in which the students are 
learning. This educational environment is largely determined by school characteristics, 
by characteristics of science instruction in the school, by home support for academics 
and other home influences, and by the students’ own views about science. NAEP 
gathers information about this environment by means of the questionnaires administered 
to principals, teachers, and students. 

Because NAEP is administered to a sample of students that is representative of the 
eighth-grade student population in the schools of Montana, NAEP results provide a view 
of the educational practices in Montana, useful for improving instruction and setting 
policy. However, despite the richness of the NAEP results, it is very important to note 
that NAEP data cannot establish a cause-and-effect relationship between educational 
environment and student scores on the NAEP science assessment. 

The variables contained in Part Two are from the school characteristics and 
policies questionnaire, teacher questionnaires, and student background questionnaires. 
Part Two consists of four chapters: Chapter 3 discusses school characteristics related 
to science instruction;^® Chapter 4 describes classroom practices related to science 
instruction, including curriculum, instructional emphases, coursework, and computer use; 
Chapter 5 describes portions of a hands-on task and explores student exposure to these 
experiences; and Chapter 6 covers some potential influences from the home and from 
the students’ own views about science. 

To provide additional information, the bullets below sometimes contain combined 
results from one or more categories (i.e., collapsed categories). When this is the case, 
the summed numbers reported in the bullets may be slightly different from the sums of 
the rounded numbers presented in the tables for each of the categories. 
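
The effect of rounding on collapsed categories can be seen in a small Python sketch. The percentages used here are hypothetical values chosen only to show the arithmetic; they do not come from any table in this report.

```python
def collapsed_percentage(*unrounded):
    """Sum the unrounded category percentages, then round once."""
    return round(sum(unrounded))


def sum_of_rounded(*unrounded):
    """Round each category first, as in a table, then add the results."""
    return sum(round(value) for value in unrounded)


# Hypothetical unrounded percentages for two response categories.
category_a, category_b = 12.4, 20.4

print(collapsed_percentage(category_a, category_b))  # rounds 32.8
print(sum_of_rounded(category_a, category_b))        # adds 12 and 20
```

In this example the combined figure rounds to 33, while the two table entries (12 and 20) add to 32, a one-point discrepancy of the kind described above.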




Information on teacher preparation is included in Appendix D of this report. 
CHAPTER 3 



School Science Education Policies and 
Practices 

School programs and conditions, instructional practices, and resource availability 
vary from state to state and even among schools within a locality. The information in 
this chapter is intended to give insight into those policies or practices that are associated 
with students’ success in science. 

The variables reported here reflect information from the questionnaires completed 
by principals and teachers of the public school students in the NAEP 1996 science 
assessment. In all cases, analyses are done at the student level. School and 
teacher-reported results are given in terms of the percentage of students who attend 
schools or who have teachers reporting particular practices.³⁹ 
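
As a rough illustration of what student-level reporting of school responses means, the sketch below computes the percentage of students attending schools that report a practice, rather than the percentage of schools. The school records, the `School` class, and the weights are hypothetical; the actual NAEP estimation procedure (see Appendix A) relies on sampling weights that are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class School:
    reports_practice: bool   # principal's questionnaire answer (hypothetical)
    student_weight: float    # weighted number of students represented (hypothetical)


def percent_of_students(schools):
    """Percentage of students (not schools) in schools reporting the practice."""
    total = sum(s.student_weight for s in schools)
    yes = sum(s.student_weight for s in schools if s.reports_practice)
    return 100.0 * yes / total


# One large school reports the practice; three smaller schools do not.
schools = [
    School(True, 300.0),
    School(False, 50.0),
    School(False, 50.0),
    School(False, 100.0),
]

share_of_schools = 100.0 * sum(s.reports_practice for s in schools) / len(schools)
share_of_students = percent_of_students(schools)
print(share_of_schools, share_of_students)
```

In this made-up case one school in four reports the practice, yet that school enrolls 60 percent of the students, which is the kind of figure the tables in this chapter report.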

Emphasis on Science in the School 

In the school characteristics and policies questionnaire, principals or other head 
administrators were asked several questions relating to the priority placed on science 
within their schools. Table 3.1 presents their responses. 

• The percentage of eighth-grade students in Montana who attended 
schools with a special focus on science (2 percent) was smaller than the 
national percentage (8 percent). 

• The percentage of eighth-grade students in Montana attending schools 
that reported science was a priority (25 percent) was smaller than the 
national percentage (43 percent). The average scale score for students 
in these schools (165) was higher than that of students in schools 
nationwide reporting that science was a priority (147). 

• The average scale score of students in Montana schools that reported that 
science was a priority (165) was not significantly different from that of 
students in schools where science was not a priority (161). 

• The percentage of eighth-grade students in Montana who attended 
schools that reported having a district or state curriculum that the school 
was expected to follow (87 percent) was not significantly different 
from* the national percentage (94 percent). 



* Although the difference may appear large, recall that “significance” here refers to “statistical significance.” 

³⁹ Appendix A provides more details on the units of analysis used to derive the results presented in this report. 
TABLE 3.1 

Public Schools’ Report on Science as a Priority 

Percentage and Average Scale Score 

                      Montana       West          Nation

Is this a school with a special focus on science?* 
  Yes
    Percentage        2 (1.2)       8 (3.9)       8 (2.7)
    Scale score       *** (**.*)    138 (4.9)!    137 (5.0)!

Has your school identified science as a priority in the last two years? 
  Yes
    Percentage        25 (3.5)      36 (10.0)     43 (6.8)
    Scale score       165 (2.3)     150 (8.1)!    147 (3.3)
  No
    Percentage        75 (3.5)      64 (10.0)     57 (6.8)
    Scale score       161 (1.5)     149 (1.9)     151 (1.7)

Does your district or state have a curriculum in science that your school is expected to follow?* 
  Yes
    Percentage        87 (3.2)      96 (2.1)      94 (2.0)
    Scale score       162 (1.4)     150 (2.3)     149 (1.0)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). * The response category “No” was inappropriate here because the question 
permitted several options to be selected; consequently, only “Yes” responses were tallied. ! Interpret with caution — the 
nature of the sample does not allow accurate determination of the variability of this statistic. *** Sample size is 
insufficient to permit a reliable estimate. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
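
The table notes describe two calculations: an approximate 95 percent interval of ± 2 standard errors around an estimate, and the use of the standard error of the difference when comparing two estimates. A minimal sketch follows. It assumes the two estimates come from independent samples, so that the standard error of the difference is the square root of the sum of the squared standard errors, and it ignores any adjustment for multiple comparisons; Appendix A remains the authority on the procedure actually used. The function names are illustrative.

```python
import math


def interval_95(estimate, se):
    """Approximate 95 percent interval: estimate plus or minus 2 standard errors."""
    return (estimate - 2 * se, estimate + 2 * se)


def se_of_difference(se_a, se_b):
    """Standard error of a difference, assuming independent samples."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def differs(est_a, se_a, est_b, se_b):
    """True if the difference exceeds 2 standard errors of the difference."""
    return abs(est_a - est_b) > 2 * se_of_difference(se_a, se_b)


# Scale scores from Table 3.1 for schools reporting science as a priority.
montana_priority = (165, 2.3)
nation_priority = (147, 3.3)
montana_not_priority = (161, 1.5)

low, high = interval_95(*montana_priority)
vs_nation = differs(*montana_priority, *nation_priority)
vs_not_priority = differs(*montana_priority, *montana_not_priority)
print(low, high, vs_nation, vs_not_priority)
```

Under these assumptions the Montana and national scores differ by more than two standard errors of the difference, while the 4-point gap between the two groups of Montana schools does not, which agrees with the bullets accompanying Table 3.1.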



Principals were also asked how often students received science instruction. 
Schools using block scheduling (i.e., extended periods of instruction on fewer days) were 
not separately identified. Consequently, students in schools with block scheduling who 
receive science instruction two or three times weekly may receive as many hours of 
instruction as students under traditional scheduling who receive instruction every day. 
Table 3.2 shows the following: 

• The average scale score for students receiving science instruction every 
day (162) was higher than that of students nationwide receiving this 
much instruction (150). 
TABLE 3.2 

Public Schools’ Reports on Time Spent in Science Instruction 

Percentage and Average Scale Score 

                      Montana       West          Nation

How often does a typical eighth-grade student in your school receive instruction in science? 
  Twice a week or less/Not taught
    Percentage        0 (****)      0 (****)      0 (****)
    Scale score       *** (**.*)    *** (**.*)
  Three or four times a week
    Percentage        1             18 (7.7)      8 (2.7)
    Scale score       *** (**.*)    *** (**.*)    147 (4.8)!
  Every day
    Percentage        98 (****)     82 (7.7)      92 (2.7)
    Scale score       162 (1.3)     149 (3.4)     150 (1.2)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 
**** Standard error estimates cannot be accurately determined. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 



Resource Availability to Teachers 

Resources available to teachers and schools vary. Past surveys have shown that 
teachers’ perceptions of the availability of resources (e.g., materials, staff, and time) 
differ across the country.⁴⁰ Previous NAEP assessments in other subject areas have 
shown an overall positive relationship in most states between teachers’ reports of 
resource availability and their students’ performance.⁴¹ 

Availability of Instructional Materials 

Teachers often see the lack of resources and materials as a key problem for science 
instruction. In 1993 a national survey of elementary and secondary school educators 
reported that deficiencies related to instructional resources were the most serious 
problems for science instruction in their schools.⁴² In that survey, schools reported 
spending a total of $0.51 per elementary student per year and $0.88 per middle grade 
student per year on science supplies, and $50 per year on science software. (The average 
price for one piece of software is $100.) 



⁴⁰ U.S. Department of Education. Schools and Staffing in the United States: A Statistical Profile, 1993-94. (Washington, 
DC: National Center for Education Statistics, 1996). 

⁴¹ For example, see Miller, K.E., J.E. Nelson, and M. Naifeh. Cross-State Data Compendium for the NAEP 1994 
Grade 4 Reading Assessment. (Washington, DC: National Center for Education Statistics, 1995); National Center for 
Education Statistics. State-by-State Background Questionnaire Data Appendix: NAEP 1992 Mathematics Assessment, 
Grades 4 and 8. (Washington, DC: Office of Educational Research and Improvement, 1994). 

⁴² Weiss, I.R. A Profile of Science and Mathematics Education in the United States: 1993. (Chapel Hill, NC: Horizon 
Research, 1994). 
Teachers whose students participated in the NAEP 1996 science assessment were 
asked to categorize how well their school systems provided them with the classroom 
instructional materials they needed. The results are shown in Table 3.3. 

• Relatively few of the students in Montana had teachers who reported 
receiving all the resources they needed (10 percent). This percentage 
was not significantly different from that of students across the nation 
(11 percent). 

• The average science scale score of students in Montana whose teachers 
reported receiving all the resources they needed (158) was not 
significantly different from that of students whose teachers received 
some or none of the resources they needed (161). 
TABLE 3.3 

Public School Teachers’ Reports on Resource Availability 

Percentage and Average Scale Score 

                      Montana       West          Nation

Which of the following statements is true about how well your school system provides 
you with the instructional materials and other resources you need to teach your class? 
  I get some or none of the resources I need.
    Percentage        31 (4.0)      39 (7.0)      37 (4.1)
    Scale score       161 (1.8)     146 (3.0)     144 (2.0)
  I get most of the resources I need.
    Percentage        59 (4.3)      54 (7.1)      52 (4.1)
    Scale score       164 (1.1)     152 (3.8)!    153 (2.1)
  I get all the resources I need.
    Percentage        10 (3.8)      7 (3.0)       11 (3.1)
    Scale score       158 (5.6)!    *** (**.*)    154 (5.4)!



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
Availability of Curriculum Specialist in the School 

Table 3.4 shows the percentages and average scale scores of eighth-grade students 
in public schools whose teachers indicated they had a curriculum specialist available to 
help or advise them in science. 

• In Montana, about one quarter of the students were taught by teachers 
who reported that there was a curriculum specialist available to help or 
advise them in science (27 percent). This figure was smaller than that 
of students across the nation (43 percent). 
TABLE 3.4 

Public School Teachers’ Reports on Curriculum Specialists 

Percentage and Average Scale Score 

                      Montana       West          Nation

Is there a curriculum specialist available to help or advise you in science? 
  Yes
    Percentage        27 (3.8)      35 (6.3)      43 (3.9)
    Scale score       162 (2.1)     151 (7.6)!    148 (2.7)
  No
    Percentage        73 (3.8)      65 (6.3)      57 (3.9)
    Scale score       163 (1.5)     149 (2.0)     152 (1.5)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
Parents as Classroom Aides 

When school personnel and parents develop a positive line of communication, they 
strengthen the learning environment for the students both at school and at home. One 
of the most frequent reasons cited by school personnel for contacting parents is to 
request parent volunteer time at school.⁴³ The principals of the participating public 
schools were asked if parents were used as classroom aides. As shown in Table 3.5, 
principals for eighth graders reported the following: 

• Relatively few of the students in Montana (10 percent) were in schools 
that reported routinely using parents as aides in classrooms while 
49 percent of students in Montana attended schools where parents were 
not used as classroom aides. 
TABLE 3.5 

Public Schools’ Reports on Parents as Aides in Classrooms 

Percentage and Average Scale Score 

                      Montana       West          Nation

Does your school use parents as aides in classrooms? 
  No
    Percentage        49 (3.9)      30 (6.1)      43 (6.0)
    Scale score       162 (1.7)     141 (2.4)!    146 (2.4)
  Yes, occasionally
    Percentage        41 (4.3)      65 (7.0)      46 (6.3)
    Scale score       162 (1.8)     151 (3.4)     150 (2.7)
  Yes, routinely
    Percentage        10 (3.5)      5 (****)      11 (3.6)
    Scale score       161 (4.9)!    *** (**.*)    152 (6.9)!



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 
**** Standard error estimates cannot be accurately determined. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
⁴³ U.S. Department of Education. The Condition of Education 1995. (Washington, DC: National Center for Education 
Student Absenteeism 

School principals were asked if student absenteeism was a serious, moderate, or 
minor problem, or not a problem. Table 3.6 shows results for eighth graders based on 
principals’ reports. 

• In Montana, 29 percent of the eighth-grade public school students 
attended schools that reported that absenteeism was a moderate to 
serious problem. This percentage was not significantly different from* 
that for the nation (22 percent). 

• The average scale score of students in Montana attending schools that 
reported that absenteeism was not a problem (163) was higher than that 
of students in schools where absenteeism was a moderate to serious 
problem (155). 
TABLE 3.6 

Public Schools’ Reports on Student Absenteeism 

Percentage and Average Scale Score 

                      Montana       West          Nation

To what degree is student absenteeism a problem in your school? 
  Not a problem
    Percentage        24 (3.2)      16 (8.5)      28 (4.8)
    Scale score       163 (2.1)     164 (10.4)!   156 (3.1)
  Minor
    Percentage        47 (3.7)      65 (8.5)      50 (4.9)
    Scale score       166 (1.1)     149 (1.7)     149 (1.5)
  Moderate to serious
    Percentage        29 (3.8)      19 (5.7)      22 (3.7)
    Scale score       155 (2.9)     138 (7.9)!    140 (3.0)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 



* Although the difference may appear large, recall that “significance” here refers to “statistical significance.” 
CHAPTER 4 



Science Classroom Practices 

Science education in the nation’s schools has received considerable attention at the 
national, state, district, school, and classroom levels. In recent years, a number of 
national and international programs have measured student performance in science. The 
latest national trend report indicates that although eighth graders’ scores have shown 
recent increases, there is no significant difference in average scores between 1970 and 
1996.⁴⁴ A recent international study, the Third International Mathematics and Science 
Study (TIMSS), demonstrated that eighth-grade students’ performance in the United 
States was slightly above average compared with that of students in 40 other 
countries.⁴⁵ 

Using guidance from such programs as the Statewide Systemic Initiative, Project 
Scope, Sequence, and Coordination, Benchmarks for Science Literacy, and the National 
Science Education Standards,⁴⁶ many states are currently involved in re-evaluating their 
existing standards and developing new frameworks and criteria for science instruction 
in their state. TIMSS has also pointed out some differences between classroom practices 
in the United States and in the 40 other participating nations that may guide development 
of more effective science instruction.⁴⁷ This chapter focuses on curricular and 
instructional content issues in Montana public schools and their relationship to students’ 
science performance. 

For some of the issues discussed in this chapter, student- and teacher-reported 
results for similar questions are presented. In these situations, some discrepancies may 
exist between student- and teacher-reported percentages. It is not possible to offer 
conclusive reasons for these discrepancies or to determine whose reports more accurately 
reflect eighth-grade classroom activities. The results merely present students’ and 
teachers’ impressions of the science classroom. 



⁴⁴ Campbell, J.R., K.E. Voelkl, and P.L. Donahue. NAEP 1996 Trends in Academic Progress. (Washington, DC: National 
Center for Education Statistics, 1997). 

⁴⁵ Beaton, A.E., M.O. Martin, I.V.S. Mullis, E.J. Gonzalez, T.A. Smith, and D.L. Kelly. Science Achievement in the 
Middle School Years: IEA’s Third International Mathematics and Science Study (TIMSS). (Chestnut Hill, MA: TIMSS 
International Study Center, 1996). 

⁴⁶ National Science Foundation, 1990, Statewide Systemic Initiative, provided grants to further research and initiatives 
in science reform; Scope, Sequence and Coordination of High School Science. Vol. I. The Content Core: A Guide for 
Curriculum Developers. (Washington, DC: National Science Teachers Association, 1992); American Association for 
the Advancement of Science. Benchmarks for Science Literacy. (New York: Oxford University Press, 1993); National 
Research Council. National Science Education Standards. (Washington, DC: National Academy Press, 1996). 

⁴⁷ National Center for Education Statistics. Pursuing Excellence. (Washington, DC: U.S. Government Printing Office, 
1996). 
Curriculum Coverage 

The NAEP 1996 science assessment examines three fields of science: earth, 
physical, and life. In grades 4 and 12, the 1996 NAEP framework emphasized the three 
fields of science more or less equally; however, the framework specified a heavier 
emphasis on life science at grade 8, consistent with the increasingly recognized 
importance of human biology for this age group.⁴⁸ Eighth-grade public school teachers 
were asked how much time was spent on the three traditional fields of science in their 
classes and the results are presented in Table 4.1. 

• In Montana, 16 percent of the eighth-grade public school students had 
teachers who reported spending a lot of time on earth science. This 
percentage was smaller than that for the nation (41 percent). Students 
in Montana in classrooms where a lot of time was spent on earth science 
had an average scale score (158) that did not differ significantly from* 
that of similar students nationwide (149). 

• In Montana, 81 percent of the public school students had teachers who 
reported spending a lot of time on physical science. This figure was 
greater than that of their national counterparts (49 percent). The 
average science scale score in classrooms where physical science was 
covered a lot was higher in Montana (163) than nationwide (151). 

• In Montana, 13 percent of the students had teachers who reported 
spending a lot of time on life science. This was not significantly 
different from* the percentage nationwide (19 percent). The average 
scale score for students in these classrooms (157) was higher than that 
of students across the nation spending a lot of time on life science (147). 



* Although the difference may appear large, recall that “significance” here refers to “statistical significance.” 

⁴⁸ National Research Council. National Science Education Standards. (Washington, DC: National Academy Press, 1996). 
TABLE 4.1 

Public School Teachers’ Reports on Curriculum Coverage 

Percentage and Average Scale Score 

                      Montana       West          Nation

How much time do you spend on each of the following areas of science in this class? 

Earth science 
  None
    Percentage        32 (4.7)†     4 (2.0)       7 (1.8)†
    Scale score       163 (2.1)     *** (**.*)    153 (4.4)!
  A little
    Percentage        27 (3.6)†     5 (2.8)       11 (3.1)†
    Scale score       163 (3.2)     *** (**.*)    153 (5.6)!
  Some
    Percentage        24 (4.1)†     48 (9.5)      41 (5.0)†
    Scale score       164 (1.9)     149 (2.6)     151 (2.1)
  A lot
    Percentage        16 (4.1)†     44 (10.7)     41 (5.6)†
    Scale score       156 (5.0)!    151 (7.3)!    149 (2.9)

Physical science 
  None
    Percentage        1 (****)†     3 (2.2)       3 (1.2)†
    Scale score       *** (**.*)    *** (**.*)    141 (9.5)!
  A little
    Percentage        6 (2.8)†      16 (4.7)      12 (3.6)†
    Scale score       159 (1.6)!    154 (4.0)!    152 (4.4)!
  Some
    Percentage        12 (3.2)†     33 (7.6)      36 (4.9)†
    Scale score       158 (5.7)!    152 (4.2)!    152 (2.6)
  A lot
    Percentage        81 (4.0)†     46 (9.1)      49 (4.9)†
    Scale score       163 (1.3)     146 (2.6)!    151 (1.8)

Life science 
  None
    Percentage        32 (4.9)†     7 (2.7)       17 (5.1)†
    Scale score       163 (2.0)     151 (9.3)!    155 (5.0)!
  A little
    Percentage        36 (4.8)†     24 (6.1)      22 (4.1)†
    Scale score       164 (2.6)     155 (7.7)!    152 (3.5)
  Some
    Percentage        19 (4.6)†     42 (6.6)      41 (6.1)†
    Scale score       160 (4.6)!    150 (2.6)!    149 (2.5)
  A lot
    Percentage        13 (4.5)†     27 (10.0)     19 (4.7)†
    Scale score       157 (3.7)!    144 (3.4)!    147 (2.6)!



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 
**** Standard error estimates cannot be accurately determined. † Interpret with caution — more than 15% of the 
respondents did not answer this question. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
Eighth-Grade Students’ Course Taking 

Exposure to science and the opportunity to learn science have a positive effect on 
the science performance of students.⁴⁹ To investigate whether there is a relationship 
between science performance of students on the 1996 NAEP assessment and their study 
of science in school, information on the types of science classes in which eighth-grade 
students were enrolled and the amount of time spent each week on science instruction 
was collected. As noted for Table 3.2, in which school principals answered a similar 
question concerning the frequency of science instruction, students in schools with block 
scheduling were not identified separately. Consequently, students under block 
scheduling who receive science instruction two or three times weekly may be receiving 
as much instruction as students in traditional settings who have science every day. 
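
The point about block scheduling is simple arithmetic, sketched below. The period lengths are hypothetical values chosen for illustration, and the function name is likewise illustrative.

```python
def weekly_minutes(sessions_per_week, minutes_per_session):
    """Total minutes of science instruction in one school week."""
    return sessions_per_week * minutes_per_session


# Hypothetical schedules (period lengths are assumptions for illustration).
traditional = weekly_minutes(sessions_per_week=5, minutes_per_session=45)
block = weekly_minutes(sessions_per_week=3, minutes_per_session=75)

print(traditional, block)
```

With these assumed period lengths, a student meeting three times a week under block scheduling receives the same 225 minutes as a student with daily 45-minute classes, so frequency of instruction alone does not measure instructional time.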

Based on students’ responses shown in Table 4.2: 

• In eighth grade, 1 percent of the students in Montana reported not taking 
a science course this year. This did not differ significantly from the 
national percentage (3 percent). 

• In Montana, the average scale score for students taking life science (153) 
was lower than that of students taking physical science (166). 

• The average scale score for Montana students taking life science (153) 
was not significantly different from that of students taking earth science 
(156). 

• In Montana, 90 percent of the students reported studying science three 
or more times a week. The average scale score for students who 
reported studying science three or more times a week in Montana (163) 
was higher than that of students studying at this level nationwide (152). 
⁴⁹ Council of Chief State School Officers. State Indicators of Science and Mathematics Education. (Washington, DC: 
CCSSO, 1995). 
TABLE 4.2 

Public School Students’ Reports on Their Science Classes 

Percentage and Average Scale Score 

                      Montana       West          Nation

Which best describes the science course you are taking? 
  I am not taking science this year.
    Percentage        1 (0.4)       4 (0.4)       3 (0.9)
    Scale score       *** (**.*)    117 (6.7)     120 (3.0)!
  Life science
    Percentage        10 (1.1)      14 (2.8)      12 (1.5)
    Scale score       153 (2.9)     135 (4.1)!    133 (3.5)
  Physical science
    Percentage        58 (2.8)      16 (4.9)      25 (2.6)
    Scale score       166 (1.3)     150 (2.4)!    154 (1.6)
  Earth science
    Percentage        14 (2.8)      23 (6.2)      23 (3.1)
    Scale score       156 (3.9)     153 (8.8)!    148 (3.6)
  General science
    Percentage        11 (1.6)      24 (2.4)      19 (1.5)
    Scale score       164 (2.1)     157 (1.9)     156 (1.7)
  Integrated science
    Percentage        5 (0.7)       20 (3.3)      17 (1.8)
    Scale score       160 (2.9)     152 (2.5)     156 (1.6)

About how often do you study science in school? 
  Never
    Percentage        2 (0.4)       4 (0.4)       4 (0.5)
    Scale score       *** (**.*)    133 (4.7)     126 (3.2)
  Less than once a week
    Percentage        2 (0.3)       3 (0.4)       4 (0.3)
    Scale score       *** (**.*)    132 (7.2)     136 (3.0)
  1 or 2 times a week
    Percentage        6 (0.6)       8 (1.9)       7 (0.8)
    Scale score       158 (3.1)     140 (4.7)!    138 (2.6)
  3 or 4 times a week
    Percentage        7 (0.9)       16 (5.2)      13 (1.9)
    Scale score       157 (3.3)     151 (3.8)!    146 (2.2)
  Every day
    Percentage        84 (1.2)      70 (7.6)      71 (2.7)
    Scale score       164 (1.2)     151 (3.4)     153 (1.3)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution — the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
Instructional Emphasis 

The framework that guided the development of the NAEP 1996 science assessment 
identified three ways of knowing and doing science: conceptual understanding, 
scientific investigation, and practical reasoning.⁵⁰ In addition, the science education 
reform effort has focused heavily on students’ ability to communicate their 
understanding of science to others.⁵¹ To assess students’ opportunities to learn and 
communicate the knowledge and skills outlined in the framework, teachers were asked 
about their plans for science instruction during the entire year. Their responses are 
shown in Table 4.3. 

• In Montana, the percentage of eighth-grade students whose teachers 
reported they planned to give moderate emphasis to knowing science 
facts and terminology (63 percent) was greater than that of students 
whose teachers planned heavy emphasis on knowing facts and 
terminology (33 percent). 

• The average scale score of students whose teachers gave moderate 
emphasis to knowing facts and terminology (161) was not significantly 
different from that of students whose teachers heavily emphasized this 
topic (163). 

• Less than one fifth of the students in Montana (14 percent) had teachers 
who reported they planned to place moderate emphasis on the 
understanding of key science concepts by their students. This 
percentage was smaller than that of students whose teachers planned 
heavy emphasis on conceptual understanding (86 percent). 

• The average scale score of students whose teachers planned moderate 
emphasis on the understanding of science concepts (161) was not 
significantly different from that of students whose teachers placed heavy 
emphasis on this topic (163). 

• In Montana, the percentage of eighth-grade students whose teachers 
reported they planned to give moderate emphasis to developing science 
problem-solving skills (31 percent) was smaller than that of students 
whose teachers planned heavy emphasis on this topic (69 percent). 

• Teachers of 57 percent of the students in Montana reported that they 
planned to place moderate emphasis on knowing how to communicate 
ideas in science effectively, greater than the percentage of students 
whose teachers reported giving this topic heavy emphasis (35 percent). 



⁵⁰ Science Framework for the 1996 National Assessment of Educational Progress. (Washington, DC: National Assessment 
Governing Board, 1993). 

⁵¹ American Association for the Advancement of Science. Benchmarks for Science Literacy. (New York: Oxford 
University Press, 1993); National Research Council. National Science Education Standards. (Washington, DC: National 
Academy Press, 1996). 
TABLE 4.3 

Public School Teachers’ Reports on Instructional Emphasis 

Percentage and Average Scale Score 

                      Montana       West          Nation

Think about your plans for your science instruction during the entire year. About how much 
emphasis will you give to the following as an objective for your students? 

Knowing science facts and terminology 
  Little or no emphasis
    Percentage        4 (1.8)       8 (5.9)       5 (2.3)
    Scale score       *** (**.*)    156 (2.5)!    154 (4.0)!
  Moderate emphasis
    Percentage        63 (3.5)      56 (5.1)      57 (3.4)
    Scale score       161 (1.5)     149 (2.6)     153 (1.4)
  Heavy emphasis
    Percentage        33 (3.1)      36 (7.3)      38 (3.9)
    Scale score       163 (1.8)     149 (6.7)!    145 (2.6)

Understanding key science concepts 
  Little or no emphasis
    Percentage        0 (****)      0 (****)      0 (****)
    Scale score       *** (**.*)    *** (**.*)    *** (**.*)
  Moderate emphasis
    Percentage        14 (2.6)      7 (2.0)       11 (2.4)
    Scale score       161 (2.3)     143 (3.2)!    143 (2.4)!
  Heavy emphasis
    Percentage        86 (2.6)      93 (2.0)      89 (2.5)
    Scale score       163 (1.4)     150 (2.7)     151 (1.2)

Developing science problem-solving skills 
  Little or no emphasis
    Percentage        1 (****)      0 (****)      3 (1.6)
    Scale score       *** (**.*)    *** (**.*)    140 (20.9)!
  Moderate emphasis
    Percentage        31 (4.2)      25 (4.8)      28 (3.7)
    Scale score       161 (1.8)     149 (7.0)!    148 (3.4)
  Heavy emphasis
    Percentage        69 (4.2)      74 (4.7)      69 (4.3)
    Scale score       163 (1.6)     150 (2.0)     152 (1.3)

Knowing how to communicate ideas in science effectively 
  Little or no emphasis
    Percentage        7 (2.6)       21 (5.9)      16 (3.3)
    Scale score       162 (2.6)!    152 (2.3)!    151 (2.7)!
  Moderate emphasis
    Percentage        57 (3.8)      33 (5.8)      42 (4.3)
    Scale score       162 (1.9)     150 (7.1)!    149 (2.3)
  Heavy emphasis
    Percentage        35 (2.8)      46 (6.4)      42 (4.4)
    Scale score       164 (1.4)     148 (2.4)     151 (1.5)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 
**** Standard error estimates cannot be accurately determined. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
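
The table note's rule of thumb (an estimate plus or minus 2 standard errors gives an approximate 95 percent interval) can be illustrated with a short Python sketch. The function name `approx_95_interval` is hypothetical and is not part of any NAEP procedure; the input values are the Montana "heavy emphasis" scale score and standard error for knowing science facts and terminology, 163 (1.8), as published in Table 4.3.

```python
def approx_95_interval(estimate, standard_error, multiplier=2.0):
    """Return (low, high) as estimate -/+ multiplier * standard_error.

    The multiplier of 2 follows the table note; it is a rule of thumb,
    not an exact critical value.
    """
    half_width = multiplier * standard_error
    return (estimate - half_width, estimate + half_width)


# Montana, knowing science facts and terminology, heavy emphasis: 163 (1.8)
low, high = approx_95_interval(163, 1.8)
print(f"Approximate 95 percent interval: {low:.1f} to {high:.1f}")
```

Read this way, the population average for that group of Montana students plausibly lies between about 159 and 167 on the 0 to 300 scale.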
With the explosion of the information age, mainstream news and the Internet 
afford opportunities to access up-to-date scientific information. Science instruction 
could benefit by taking advantage of such opportunities. To determine if these 
opportunities were being explored, eighth-grade teachers and students were asked how 
often they have classroom discussions about science stories that appear in the news. 
The results are presented in Table 4.4. 

• In Montana, 53 percent of eighth-grade students were taught by teachers 
who reported frequent (once a week or more) classroom discussions of 
science in the news. A small percentage of the students (6 percent) had 
teachers who reported never or hardly ever discussing science in the 
news. 

• When students were asked how often they discussed science in the news, 
31 percent reported frequent discussions while 45 percent reported 
never or hardly ever discussing it. 
TABLE 4.4

Public School Teachers’ and Students’ Reports on
Discussions of Science in the News

How often do your students (do you) discuss science in the news?

Percentage and Average Scale Score

                        Montana                   West                      Nation
                        Teacher      Student      Teacher      Student      Teacher      Student

Never or hardly ever
  Percentage            6 (2.6)      45 (1.8)     9 (4.6)      [illegible]  8 (2.6)      44 (1.3)
  Average scale score   162 (3.5)!   160 (1.5)    143 (2.4)!   [illegible]  155 (7.6)!   144 (1.?)

Once or twice a month
  Percentage            41 (3.9)     24 (0.9)     46 (8.9)     22 (1.1)     44 (4.9)     22 (1.1)
  Average scale score   162 (1.4)    166 (1.7)    153 (4.8)!   157 (2.3)    150 (2.1)    155 (1.9)

Once or twice a week
  Percentage            44 (5.2)     22 (1.6)     29 (5.4)     21 (1.5)     33 (2.9)     22 (0.9)
  Average scale score   162 (2.2)    164 (2.1)    148 (4.1)!   151 (2.7)    149 (2.0)    154 (1.8)

Almost every day
  Percentage            8 (2.1)      9 (0.8)      16 (7.7)     10 (0.8)     16 (4.9)     11 (1.1)
  Average scale score   169 (2.9)!   157 (2.8)    146 (3.7)!   141 (2.9)    153 (3.8)!   147 (2.8)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate 
determination of the variability of this statistic. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
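
The "frequent (once a week or more)" figures quoted in the text combine two response categories from Table 4.4. The Python sketch below shows that arithmetic for the Montana columns; the dictionary keys and the function name `frequent` are illustrative choices, not NAEP terminology. Note that summing the rounded teacher percentages gives 52, one point below the 53 percent quoted in the text; a plausible explanation, stated here as an assumption, is that the published figure was computed from unrounded percentages.

```python
# Montana percentages as published (rounded) in Table 4.4.
teacher = {"never": 6, "monthly": 41, "weekly": 44, "daily": 8}
student = {"never": 45, "monthly": 24, "weekly": 22, "daily": 9}


def frequent(percentages):
    """Combine 'once or twice a week' and 'almost every day'."""
    return percentages["weekly"] + percentages["daily"]


teacher_frequent = frequent(teacher)
student_frequent = frequent(student)
print(f"Teachers reporting frequent discussion: {teacher_frequent} percent")
print(f"Students reporting frequent discussion: {student_frequent} percent")
```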
Science Homework 

Past NAEP science assessments have shown a positive relationship between 
science homework and performance.^ To examine the relationship between homework 
and science scale scores in Montana, the teachers of the assessed students were asked 
to report the amount of science homework they assigned each week, and students were 
asked to report the amount of time they spent on science homework each week. 

Tables 4.5 and 4.6 show the teachers’ and students’ responses for eighth-grade 
public school students in Montana. (Students had an additional response choice, “I am 
not taking a science course this year,” but no analogous option was available to 
teachers.) According to the teachers’ responses: 

• In Montana, 2 percent of the eighth graders were not assigned science 
homework each week. In addition, 87 percent of the students were 
assigned an hour or more of homework each week. 

• The percentage of students in Montana whose teachers assigned an hour 
or more of homework each week (87 percent) was not significantly 
different from the corresponding national percentage (86 percent). 
TABLE 4.5

Public School Teachers’ Reports on Homework in Science

About how much time do you expect a student in this class to spend doing
homework each week?

Percentage and Average Scale Score

                            Montana       West          Nation

None
  Percentage                2 (0.6)       3 (1.4)       2 (0.8)
  Average scale score       *** (**.*)    139 (4.0)!    134 (4.5)!

1/2 hour
  Percentage                12 (1.9)      15 (4.6)      12 (2.3)
  Average scale score       160 (2.5)     144 (3.4)!    142 (3.3)!

1 hour
  Percentage                34 (4.2)      46 (6.4)      42 (4.1)
  Average scale score       163 (1.5)     154 (5.2)     152 (2.1)

2 hours
  Percentage                42 (3.6)      29 (6.7)      28 (4.4)
  Average scale score       164 (1.8)     148 (3.3)!    152 (3.0)

More than 2 hours
  Percentage                11 (3.5)      6 (2.7)       15 (4.8)
  Average scale score       155 (5.5)!    151 (3.9)!    156 (3.9)!


The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 







Jones, L.R., I.V.S. Mullis, S.A. Raizen, I.R. Weiss, and E.A. Weston. The 1990 Science Report Card: NAEP’s 
Assessment of Fourth, Eighth, and Twelfth Graders. (Washington, DC: National Center for Education Statistics, 1992). 
The eighth-grade students’ reports indicated that: 

• About one fifth of the eighth graders did not spend any time on science 
homework in a typical week (19 percent) while 39 percent spent one 
hour or more on their science homework each week. 

• The percentage of students in Montana who spent an hour or more on 
homework each week (39 percent) was somewhat greater than the 
percentage of students nationwide spending this much time on 
homework each week (34 percent). 
TABLE 4.6

Public School Students’ Reports on Homework in Science

If you are taking science this year, about how much time do you spend
doing science homework each week?

Percentage and Average Scale Score

                            Montana       West          Nation

I am not taking a science course this year.
  Percentage                1 (0.4)       4 (0.7)       4 (0.9)
  Average scale score       *** (**.*)    136 (5.8)!    127 (3.1)!

None
  Percentage                19 (2.1)      28 (3.2)      22 (1.5)
  Average scale score       162 (2.3)     148 (2.1)     147 (1.6)

1/2 hour
  Percentage                41 (1.5)      37 (2.5)      40 (1.4)
  Average scale score       163 (1.6)     152 (2.7)     151 (1.1)

1 hour
  Percentage                21 (1.3)      18 (1.1)      19 (0.7)
  Average scale score       163 (1.6)     144 (2.5)     148 (1.6)

2 hours
  Percentage                9 (0.6)       7 (0.8)       8 (0.5)
  Average scale score       164 (2.5)     155 (7.7)     156 (2.7)

3 hours
  Percentage                5 (0.6)       3 (0.7)       3 (0.4)
  Average scale score       163 (3.4)     162 (7.2)!    157 (3.1)

More than 3 hours
  Percentage                4 (0.5)       4 (0.5)       4 (0.4)
  Average scale score       153 (3.3)     140 (6.7)     152 (3.5)


The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
In addition to being asked about science homework in general, students were asked 
how often they use a computer at home for schoolwork. Because the question was not 
restricted to science homework, students’ reports most likely included homework for 
other academic areas such as English and mathematics. Given the trend that home 
computers are steadily assuming more importance for completing homework 
assignments,^^ it seems useful that NAEP monitor the prevalence of this practice and its 
relationship to performance. 

Based on the reports of eighth graders in Montana, as shown in Table 4.7: 

• About one third of the students reported that there was no computer at 
home (33 percent) and another 16 percent reported never or hardly ever 
using their home computer to do homework. 

• About one third of the eighth graders reported using their home 
computer to do homework almost every day (13 percent) or once or 
twice a week (19 percent). 

• The average scale score for students who used a computer almost every 
day for homework (166) was not significantly different from that of 
students who never or hardly ever did so (161). 

• The average scale score for students who used a computer almost every 
day for homework (166) was not significantly different from that of 
students who used a computer at home once or twice a month (169). 
TABLE 4.7

Public School Students’ Reports on Using Computers at Home

How often do you use a computer at home for schoolwork?

Percentage and Average Scale Score

                            Montana       West          Nation

There is no computer at home.
  Percentage                33 (1.5)      32 (1.8)      36 (1.2)
  Average scale score       155 (2.0)     139 (1.9)     143 (1.0)

Never or hardly ever
  Percentage                16 (0.9)      18 (1.8)      17 (0.9)
  Average scale score       161 (2.0)     144 (2.4)     144 (1.6)

Once or twice a month
  Percentage                20 (0.9)      17 (1.3)      15 (0.5)
  Average scale score       169 (1.4)     160 (2.2)     160 (1.8)

Once or twice a week
  Percentage                19 (1.0)      18 (1.6)      17 (1.1)
  Average scale score       167 (2.0)     159 (3.4)     157 (1.9)

Almost every day
  Percentage                13 (1.0)      15 (1.1)      15 (0.7)
  Average scale score       166 (2.4)     154 (3.7)     154 (1.9)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 



U.S. Department of Education. Digest of Education Statistics 1995. (Washington, DC: National Center for Education 
Statistics, 1995). 
Computer Use in Science Instruction 

The use of computers in the collection of data, interpretation of results, and 
communication of findings is part of the Benchmarks for Science Literacy and the 
recently published National Science Education Standards.^ Recommendations for 
facilitating science instruction in the nation’s schools often include more use of 
computers. Computers can be used to demonstrate scientific concepts, simulate 
scientific phenomena, deliver instruction, and collect and analyze data. Of course, 
effective computer use may depend on many factors other than availability, such as 
teachers’ training or whether computers have been incorporated into the curriculum 
effectively. 

Computers are increasingly important in students’ homes, where they are used for 
homework as well as for other pursuits. Since 1984 the percentage of students in grades 
7 through 12 who use a computer at school or at home has increased over two-fold, to 
approximately 60 percent of students using a computer at school and 30 percent using 
one at home.’’ 

Given the potential role of computers in science instruction, NAEP asked teachers 
in Montana about the availability and use of computers in science instruction. As 
presented in Table 4.8, when eighth-grade science teachers in Montana were asked about 
the availability of computers, their responses indicated the following: 

• In Montana, 8 percent of the students were in science classes where 
computers were not available. This percentage was smaller than that for 
the nation (17 percent). 

• The average scale score of Montana students whose teachers reported 
not having any computers available (162) was not significantly different 
from that of students whose teachers reported having one computer in 
the classroom (163). 

• The average scale score of students in Montana whose teachers reported 
that no computers were available (162) was not significantly different 
from* that of students whose teachers reported having four or more 
(154). 

• The average scale score of students in Montana whose teachers reported 
having four or more computers within the classroom (154) was lower 
than that of students whose teachers reported having easy access to a 
computer lab (167). 



* Although the difference may appear large, recall that “significance” here refers to “statistical significance.” 

^ American Association for the Advancement of Science. Benchmarks for Science Literacy. (New York: Oxford 
University Press, 1993); National Research Council. National Science Education Standards. (Washington, DC: National 
Academy Press, 1996). 

U.S. Department of Education. Digest of Education Statistics 1995. (Washington, DC: National Center for Education 
Statistics, 1995). 
TABLE 4.8

Public School Teachers’ Reports on the Availability of Computers

Which best describes the availability of computers for use by your science
students?

Percentage and Average Scale Score

                            Montana       West          Nation

None available
  Percentage                8 (2.6)       15 (5.2)      17 (3.4)
  Average scale score       162 (1.8)!    157 (14.5)!   149 (5.6)!

One within the classroom
  Percentage                26 (3.5)      21 (4.8)      22 (4.8)
  Average scale score       163 (3.1)     149 (3.3)!    149 (3.2)!

Two or three within the classroom
  Percentage                5 (1.8)       8 (4.5)       9 (4.6)
  Average scale score       164 (7.2)!    *** (**.*)    156 (7.2)!

Four or more within the classroom
  Percentage                7 (3.0)       16 (7.4)      7 (3.0)
  Average scale score       154 (4.0)!    159 (3.5)!    159 (2.8)!

Available in a computer laboratory but difficult to access or schedule
  Percentage                40 (4.6)      24 (8.9)      32 (4.9)
  Average scale score       161 (2.2)     147 (3.9)!    149 (2.1)

Available in a computer laboratory and easy to access or schedule
  Percentage                14 (1.7)      15 (3.6)      13 (2.6)
  Average scale score       167 (1.8)     146 (4.2)!    148 (2.4)




The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
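
The comparisons of Montana scale scores by computer availability can be approximated with a small Python sketch of the "standard error of the difference" mentioned in the table note. This is an assumption-laden illustration, not the procedure of Appendix A (which is not reproduced here): it treats the two estimates as independent, combines their standard errors as the square root of the sum of squares, and uses the note's rule of thumb of 2 standard errors, with no adjustment for multiple comparisons. The function names are hypothetical.

```python
import math


def se_of_difference(se_a, se_b):
    """Standard error of a difference, assuming independent estimates."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def differs(est_a, se_a, est_b, se_b, threshold=2.0):
    """True if |est_a - est_b| exceeds threshold * SE of the difference."""
    return abs(est_a - est_b) > threshold * se_of_difference(se_a, se_b)


# Montana scale scores from Table 4.8:
none_available = (162, 1.8)
four_or_more = (154, 4.0)
lab_easy_access = (167, 1.8)

# 162 vs. 154: a gap of 8 points against 2 * 4.39, about 8.8.
print(differs(*none_available, *four_or_more))

# 167 vs. 154: a gap of 13 points against the same 8.8.
print(differs(*lab_easy_access, *four_or_more))
```

Under these assumptions the 8-point gap falls inside the margin while the 13-point gap falls outside it, which is consistent with the two statements in the text and with the footnote reminding readers that a difference can look large without being statistically significant.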
The availability of computers varies from school to school, and the uses for 
computers can vary widely from class to class. Computers can be used in many ways 
to help students learn science, including simulating scientific phenomena or illustrating 
models. Also, the frequency of use can vary, regardless of the primary use in the 
classroom. Teachers in Montana were asked how they used computers and how often 
they were used in their science classrooms. Also, students were asked how often they 
used computers when doing science in school. The responses of eighth-grade public 
school teachers to the purpose of use for science instruction, as shown in Table 4.9, 
indicate the following: 

• The percentage of Montana students whose teachers reported that they 
used computers for simulations and modeling (25 percent) was not 
significantly different from the corresponding national percentage 
(26 percent). 

• The percentage of students in Montana whose teachers reported that their 
use of computers for instruction in science was for data analysis and 
other applications (31 percent) was not significantly different from* that 
of students nationwide (20 percent). 

• Less than half of the eighth graders had teachers who reported not using 
a computer for science instruction (37 percent). This percentage did 
not differ significantly from* the percentage for the nation (46 percent). 

Table 4.10 presents teacher and student reports on the frequency of use of 
computers for science. 

• In Montana, 61 percent of the students had teachers who reported never 
or hardly ever using a computer with their classes, while a small 
percentage reported doing so almost every day (0 percent) or once or 
twice a week (7 percent). 

• In Montana, 63 percent of the students reported never or hardly ever 
using computers to do science in school. Furthermore, 3 percent of the 
students reported using computers almost every day and 9 percent used 
them once or twice a week. 



* Although the difference may appear large, recall that “significance” here refers to “statistical significance.” 
TABLE 4.9

Public School Teachers’ Reports on the Use of Computers for
Instruction in Science

How do you use computers for instruction in science?

Percentage and Average Scale Score

                            Montana       West          Nation

Drill and practice
  Percentage                11 (2.7)      6 (3.0)       8 (4.4)
  Average scale score       165 (2.7)!    *** (**.*)    155 (6.8)!

Playing science/learning games
  Percentage                14 (1.9)      23 (5.9)      20 (3.8)
  Average scale score       162 (3.0)     150 (3.4)!    150 (3.9)

Simulations and modeling
  Percentage                25 (3.8)      37 (8.5)      26 (5.5)
  Average scale score       164 (2.5)     152 (2.2)!    153 (2.4)!

Data analysis and other applications
  Percentage                31 (4.2)      18 (6.1)      20 (3.5)
  Average scale score       162 (1.9)     147 (2.3)!    149 (1.6)

Word processing
  Percentage                39 (4.0)      18 (4.4)      22 (3.5)
  Average scale score       164 (1.7)     143 (6.1)!    152 (2.2)

I do not use computers for science instruction.
  Percentage                37 (4.1)      38 (6.3)      46 (4.2)
  Average scale score       161 (2.4)     153 (4.6)     149 (2.1)




The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
TABLE 4.10

Public School Teachers’ and Students’ Reports on the
Frequency of Computer Use

How often do your students (do you) use a computer for science?

Percentage and Average Scale Score

                        Montana                   West                      Nation
                        Teacher      Student      Teacher      Student      Teacher      Student

Never or hardly ever
  Percentage            61 (4.8)     63 (2.0)     59 (8.6)     68 (2.9)     62 (4.3)     67 (1.8)
  Average scale score   162 (1.5)    162 (1.5)    152 (3.9)!   151 (2.4)    150 (1.8)    150 (1.1)

Once or twice a month
  Percentage            32 (4.8)     25 (1.3)     33 (8.0)     19 (1.4)     31 (4.0)     18 (1.1)
  Average scale score   162 (2.1)    165 (1.8)    145 (1.9)!   152 (1.9)    151 (2.2)    154 (1.9)

Once or twice a week
  Percentage            7 (1.6)      9 (1.1)      7 (4.1)      10 (1.6)     7 (2.4)      10 (1.0)
  Average scale score   164 (2.5)!   159 (2.9)    158 (7.4)!   142 (4.7)    156 (4.0)!   145 (2.9)

Almost every day
  Percentage            0 (****)     3 (0.4)      1 (0.8)      3 (0.6)      0 (0.3)      5 (0.5)
  Average scale score   *** (**.*)   156 (4.1)    [illegible]  122 (6.1)    *** (**.*)   135 (3.6)



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said 
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2 
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the 
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate 
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate. 
**** Standard error estimates cannot be accurately determined. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
CHAPTER 5 



Student Performance on Hands-On Science 
Tasks 

A number of goals for science education have been put forward in a series of 
reports authored by government agencies and professional societies over the last 15 
years.“ These goals include acquisition of a core of scientific understanding, ability to 
apply science knowledge in practical ways, familiarity with experimental design, and the 
ability to carry out scientific experiments. The reports also offered recommendations 
for the science curricula and instruction needed to achieve these goals, such as 
encouraging active student participation in hands-on science, incorporating cooperative 
group learning, and assignment of sustained projects to students.” 

A 1993 national survey indicated that science teachers devote 21 to 26 percent 
of class time to hands-on or manipulative activities.’* While research on the relationship 
between exposure to hands-on science tasks and overall science performance is sparse 
and inconclusive, a recent study has demonstrated a positive relationship for 
eighth-grade students between the frequency of hands-on activities and their performance 
on a standardized assessment” 



National Science Board Commission on Precollege Education in Mathematics, Science, and Technology. Educating 
Americans for the 21st Century. (Washington, DC: National Science Foundation, 1983); Science for All Americans: A 
Project 2061 Report on Literacy Goals in Science, Mathematics, and Technology. (Washington, DC: American 
Association for the Advancement of Science, 1989); Aldridge, B.G. Essential Changes in Secondary School Science: 
Scope, Sequence, and Coordination. (Washington, DC: National Science Teachers Association, 1989); National 
Research Council. Fulfilling the Promise: Biology Education in the Nation’s Schools. (Washington, DC: National 
Academy Press, 1990). 

^ Science Framework for the 1996 National Assessment of Educational Progress. (Washington, DC: National Assessment 
Governing Board, 1993). 

Blank, R.K. and D. Gruebel. State Indicators of Science and Mathematics Education. (Washington, DC: Council of 
Chief State School Officers, 1995). 

Stohr-Hunt, P.M. “An Analysis of Frequency of Hands-On Experience and Science Achievement.” Journal of Research 
in Science Teaching, 33. (1996, pp. 101-109). 
NAEP included assessments of higher-order thinking skills in science and 
mathematics as early as 1986 through a pilot assessment that required students to work 
on various hands-on tasks. Although the NAEP 1990 science assessment measured 
skills that were integral to scientific investigation,* hands-on tasks were not included. 
When the 1996 science framework®' was developed in the early 1990s, it took into 
account the current reforms in science education by specifying three question types that 
probed understanding of conceptual and reasoning skills: performance exercises, 
constructed-response questions, and multiple-choice questions. It was envisaged that in 
the performance exercises, students would manipulate selected physical objects and try 
to solve a scientific problem using the objects before them. Hands-on tasks that met 
these criteria were developed for the 1996 science assessment, and each student who 
participated in the assessment was given an opportunity to conduct one of them. 

NAEP Hands-On Science Tasks 

Four different hands-on tasks were administered in the NAEP 1996 science 
assessment. Each task was designed to use materials to perform an investigation, make 
observations, evaluate experimental results, and apply problem-solving skills. In 
addition, tasks shared the following characteristics: 

• Diagrams were included to guide students through the procedures; 

• Multiple-choice and constructed-response questions were embedded 
throughout the tasks; and 

• Scientific investigation was integrated with conceptual understanding 
and practical reasoning. 

The creation of the hands-on tasks presented special challenges. Since the 
assessment was administered in a variety of settings, ranging from laboratories to 
cafeterias, all of the equipment needed to conduct each task had to be 
provided in a self-contained kit produced according to standard specifications to ensure 
uniformity. There were some limitations on materials and equipment. For example, live 
materials (with the exception of seeds) and equipment that required an electric outlet 
were not used. Safety was also an important concern and was addressed in a number 
of ways. The state’s safety regulations were considered; no toxic or corrosive chemicals 
were used; assessment administrators were trained in appropriate laboratory safety; and 
students were provided with goggles for some tasks. 

One of the four hands-on tasks is briefly summarized in this chapter. 
Several questions from that task are also shown with their scoring criteria. 



^ Science Objectives: 1990 Assessment. (Princeton, NJ: The National Assessment of Educational Progress, 1989). 

Science Framework for the 1996 National Assessment of Educational Progress. (Washington, DC: National Assessment 
Governing Board, 1993). 
Sample Questions from a Task 



A brief summary of one of the four tasks given to grade 8 students in Montana 
is presented below with sample questions in Figures 5.1 and 5.2. 

Salt Solutions: Estimating the Salt Concentration of an Unknown Salt Solution 
Using the “Floating Pencil Test” 

An instrument constructed from a pencil and thumbtack served as a hydrometer in this 
task. Students were asked to observe, measure, and compare the lengths of a portion 
of the pencil, marked with calibrations for ease of measurement, that floated above 
the surface in distilled water and in a 25 percent salt solution. Based on these 
observations, students were asked to predict how the addition of more salt to the salt 
solution would affect the floating pencil. Students then measured the length of the 
pencil that floated above the surface of a solution of unknown salt concentration and 
used the results of their previous observations to estimate the salt concentration of the 
unknown solution. The task assessed students’ ability to make simple observations, 
measure length using a ruler, apply observations to an unknown, draw a graph, 
interpolate from graphical data, and make a generalized inference from observations. 

The task also assessed students’ understanding of the value of performing multiple 
trials of the same procedure. 

Figure 5.1 shows a data table that was presented in the first stage of the task. Questions 
3, 4, and 5 are also presented in this figure. Students were asked to measure the length 
of pencil floating above the surface in three solutions: distilled water, a 25 percent salt 
solution, and a solution containing an unknown concentration of salt. The students 
recorded two measurements for each of the 3 solutions in Table 1 and calculated the 
average of each pair of readings. The scoring rubrics for Complete responses are shown 
in Figure 5.1. 
FIGURE 5.1

Salt Solutions Task: Questions 3, 4, and 5

3. Now take the pencil out of the water and dry it with a paper towel. 
Use the ruler to measure the length of the pencil that was above the 
water. Record the length in Table 1 below under Measurement 1. 

TABLE 1

                          Length of Pencil Above Water Surface (cm)
Type of Solution          Measurement 1    Measurement 2    Average

Distilled Water           ________         ________         ________
Salt Solution             ________         ________         ________
Unknown Salt Solution     ________         ________         ________


4. Now place the pencil back in the distilled water and repeat steps 2 and 3. 
Record your measurement in Table 1 under Measurement 2. 

5. Calculate the average of Measurements 1 and 2 and record the result 
in the data table. 

(You can calculate the average by adding Measurement 1 + Measurement 2 
and then dividing by two.) 

SCORING RUBRIC 

Measurement: A Complete response has three pairs of measurements that agree 
within a given tolerance and also are in the correct relative order. 

Average: A Complete response correctly calculates the average for each set of data. 
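
The averaging step in Question 5 and the agreement check described in the scoring rubric can be sketched in Python as follows. The readings and the 0.5 cm tolerance are hypothetical values chosen only for illustration; the report says measurements must agree "within a given tolerance" but does not state the tolerance, and the function names are likewise illustrative.

```python
def average(measurement_1, measurement_2):
    """Average of two readings, as instructed in Question 5."""
    return (measurement_1 + measurement_2) / 2


def readings_agree(measurement_1, measurement_2, tolerance_cm=0.5):
    """Check that two readings agree within a tolerance.

    The 0.5 cm default is a hypothetical value; the report does not
    specify the tolerance used in scoring.
    """
    return abs(measurement_1 - measurement_2) <= tolerance_cm


# Hypothetical student readings in centimeters (not data from the report).
readings = {
    "Distilled Water": (2.0, 2.2),
    "Salt Solution": (4.0, 4.2),
    "Unknown Salt Solution": (3.0, 3.2),
}

for solution, (m1, m2) in readings.items():
    print(f"{solution}: average {average(m1, m2):.1f} cm, "
          f"agree within tolerance: {readings_agree(m1, m2)}")
```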
Students were then presented with graph paper and asked to plot the average of 
the measurements for distilled water and 25 percent salt solution against salt 
concentration. They were told to assume a linear relationship between the height of the 
pencil above the solution and the salt concentration, and then asked to use the graph to 
determine the salt concentration of the unknown solution (Figure 5.2). The scoring 
rubric for a Complete response is also shown. 
FIGURE 5.2

Salt Solutions Task: Question 14

14. Based on the graph that you plotted, what is the salt concentration of 
the unknown solution? 

Explain how you determined your answer. 



SCORING RUBRIC 

Unknown Solution: A Complete response gave a salt concentration consistent with 
the graph and correctly explained how the graph was used to obtain the answer. 
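
Reading the unknown concentration off a straight-line graph amounts to linear interpolation between the two known points, distilled water (0 percent salt) and the 25 percent salt solution. The Python sketch below shows that calculation under the linearity assumption students were told to make; the heights used are hypothetical, and the function name `estimate_concentration` is illustrative rather than anything defined in the assessment.

```python
def estimate_concentration(height_water, height_salt, height_unknown,
                           salt_percent=25.0):
    """Estimate salt concentration by linear interpolation.

    Assumes pencil height above the surface rises linearly from
    height_water at 0 percent salt to height_salt at salt_percent.
    """
    if height_salt == height_water:
        raise ValueError("The two reference heights must differ.")
    fraction = (height_unknown - height_water) / (height_salt - height_water)
    return salt_percent * fraction


# Hypothetical average heights in centimeters (not data from the report).
concentration = estimate_concentration(2.0, 4.0, 3.0)
print(f"Estimated concentration: {concentration:.1f} percent")
```

With these illustrative heights the unknown sits halfway between the two reference readings, so the estimate is halfway between 0 and 25 percent.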



Instruction Related to Scientific Investigation 

Research devoted to the effectiveness of hands-on tasks is ongoing, although there 
is evidence that eighth graders who are exposed to hands-on activities more frequently 
perform better on standardized assessments.^ Eighth-grade science teachers in Montana 
were asked about the emphasis they placed on laboratory skills and data analysis in their 
science classes and about the frequency and nature of hands-on activities or 
investigations assigned by them. Students were asked about the frequency and nature 
of hands-on activities or investigations conducted by them. 

As mentioned before, a direct cause-and-effect relationship between educational 
environment and student scores on the NAEP science assessment is not implied. For 
instance, the motivation and expectations of teachers or students reporting hands-on 
investigations hardly ever or once or twice a week may be a factor in the average score 
differences. However, responses to teacher (and school) questionnaires provide a broad 
view of educational practices that should prove useful for improving instruction and 
setting policy. 



^ Stohr-Hunt, P.M. “An Analysis of Frequency of Hands-On Experience and Science Achievement.” Journal of Research
in Science Teaching, 33 (1996), pp. 101-109.

Teachers’ and students’ responses regarding scientific investigation are presented
in Tables 5.1 through 5.5.

• The percentage of eighth-grade students in Montana whose teachers
reported placing heavy emphasis on the development of laboratory skills
and techniques (48 percent) was not significantly different from the
percentage nationwide (42 percent). Students whose teachers reported
heavy emphasis on laboratory skills and techniques in Montana had an
average scale score (163) which was higher than that of students
nationwide whose teachers reported this (153).

• The percentage of eighth-grade students in Montana whose teachers 
reported moderate to heavy emphasis on the development of data 
analysis skills (91 percent) was not significantly different from that of 
students nationwide (89 percent). Eighth-grade students whose teachers 
reported moderate to heavy emphasis on data analysis skills had an 
average science scale score (162) which did not differ significantly from 
that of students whose teachers reported little or no emphasis on the 
development of data analysis skills (162). 

TABLE 5.1

Public School Teachers’ Reports on Science Instruction
Related to Performance Tasks

Think about your plans for your science instruction during the entire year.
About how much emphasis will you give to each of the following?

Percentage and Average Scale Score

                                   Montana        West           Nation

Developing laboratory skills and techniques as an objective for your students

  Little or no emphasis            6 (2.5)        7 (2.1)        13 (2.5)
                                   162 (3.0)!     135 (5.3)!     135 (3.6)!

  Moderate emphasis                47 (4.4)       41 (9.5)       44 (4.7)
                                   162 (1.5)      149 (2.9)!     152 (2.0)

  Heavy emphasis                   48 (4.8)       52 (9.0)       42 (4.5)
                                   163 (1.9)      152 (4.9)!     153 (2.1)

Developing data analysis skills

  Little or no emphasis            9 (2.0)        4 (1.9)        11 (2.7)
                                   162 (3.5)!     136 (4.1)!     139 (5.5)!

  Moderate emphasis                65 (4.0)       75 (4.4)       65 (5.3)
                                   162 (1.4)      150 (2.3)      151 (1.6)

  Heavy emphasis                   26 (3.6)       [illegible]    24 (4.3)
                                   165 (2.6)      [illegible]    153 (3.0)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate
determination of the variability of this statistic.

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment
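
The table note says that two estimates must be compared using the standard error of the difference. A minimal sketch of that comparison follows, assuming the two samples are independent and using 2 standard errors as the cutoff named in the note. The function names are hypothetical, and the procedure described in Appendix A may apply further adjustments that are not reproduced here.

```python
import math


def se_of_difference(se_a, se_b):
    """Standard error of the difference between two independent estimates."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def differs(est_a, se_a, est_b, se_b, critical=2.0):
    """True when the gap exceeds `critical` standard errors of the difference.

    Assumption: independent samples and no multiple-comparison adjustment.
    """
    gap = abs(est_a - est_b)
    return gap > critical * se_of_difference(se_a, se_b)


# Heavy emphasis on laboratory skills, Montana versus Nation (Table 5.1).
print(differs(163, 1.9, 153, 2.1))  # average scale scores
print(differs(48, 4.8, 42, 4.5))    # percentages of students
```

Under these assumptions the 10-point gap in scale scores exceeds twice its standard error (about 2.8), while the 6-point gap in percentages does not (standard error about 6.6), which agrees with the comparisons reported in the text.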

• About two thirds of the eighth-grade students in Montana (70 percent)
had teachers who reported doing a science demonstration at least once
a week, not significantly different from* the percentage of students
nationwide (59 percent). About half of eighth-grade students in
Montana (52 percent) reported that their teacher performed science
demonstrations at least once a week.

• The eighth-grade students in Montana whose teachers reported doing a 
science demonstration at least once a week had an average scale score 
(163) which was higher than that of their national counterparts (151). 

TABLE 5.2

Public School Teachers’ and Students’ Reports on the
Frequency of Science Demonstrations

How often do you (does your teacher) do a science demonstration?

Percentage and Average Scale Score

                          Montana                   West                      Nation
                          Teacher      Student      Teacher      Student      Teacher      Student

Never or hardly ever      [illegible]  23 (1.4)     1 (0.7)      30 (2.5)     2 (0.8)      30 (1.3)
                          [illegible]  156 (1.9)    [illegible]  141 (2.9)    149 (11.6)!  141 (1.5)

Once or twice a month     27 (4.3)     25 (1.8)     40 (6.9)     29 (1.7)     39 (4.1)     29 (1.1)
                          161 (2.2)    159 (1.7)    151 (1.6)!   152 (2.8)    150 (2.0)    151 (1.3)

Once or twice a week      60 (4.5)     35 (1.2)     44 (5.3)     27 (1.5)     49 (3.5)     28 (1.2)
                          162 (1.4)    168 (1.4)    151 (5.6)    155 (2.6)    152 (1.9)    156 (1.4)

Almost every day          10 (1.0)     17 (1.4)     15 (6.0)     15 (1.6)     10 (2.3)     14 (0.9)
                          168 (2.7)    164 (2.3)    144 (2.4)!   150 (3.4)    144 (2.0)!   153 (2.0)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate.

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment



* Although the difference may appear large, recall that “significance” here refers to “statistical significance.” 

• The percentage of eighth-grade students in Montana whose teachers
reported their science students performed hands-on tasks once a week
or more (80 percent) was not significantly different from the national
percentage (83 percent).

• The eighth-grade students in Montana whose teachers reported their 
students did hands-on tasks at least once a week had an average science 
scale score (164) which was higher than that of students nationwide 
whose teachers reported this same level of hands-on task experience 
(153). 

• The eighth-grade students in Montana whose teachers reported their 
students did hands-on tasks almost every day had an average scale score 
(165) which did not differ significantly from* that of students whose 
teachers reported doing hands-on activities once or twice a month (158). 

TABLE 5.3

Public School Teachers’ and Students’ Reports on the
Frequency of Hands-on Activities or Investigations

How often do your students (do you) do hands-on activities or
investigations in science?

Percentage and Average Scale Score

                          Montana                   West                      Nation
                          Teacher      Student      Teacher      Student      Teacher      Student

Never or hardly ever      1 (****)     9 (1.5)      1 (0.2)      17 (1.7)     1 (0.6)      18 (1.1)
                          *** (**.*)   142 (2.3)    *** (**.*)   137 (2.4)    119 (4.0)!   134 (1.2)

Once or twice a month     20 (3.5)     36 (2.0)     5 (1.9)      34 (2.8)     16 (2.4)     32 (1.5)
                          158 (3.7)    163 (1.1)    129 (5.5)!   152 (3.3)    140 (3.4)    152 (1.5)

Once or twice a week      56 (4.6)     38 (1.6)     70 (7.0)     32 (1.8)     64 (3.5)     33 (1.3)
                          163 (1.4)    166 (1.4)    151 (3.4)    153 (2.6)    153 (1.5)    155 (1.2)

Almost every day          23 (4.5)     18 (2.1)     25 (6.5)     17 (1.6)     19 (3.2)     18 (1.1)
                          165 (2.3)!   163 (2.3)    150 (2.2)!   150 (2.7)    152 (2.2)    151 (1.5)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate.
**** Standard error estimates cannot be accurately determined.

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment

• About three quarters of the eighth-grade students in Montana
(79 percent) had teachers who reported assigning science projects in
school which take a week or more to complete. More than half of the
students (63 percent) reported receiving such assignments and their
average scale score was 163.

• The average scale score of students who reported doing science projects 
or investigations that take a week or more (163) was not significantly 
different from that of students who did not (161). 

TABLE 5.4

Public School Teachers’ and Students’ Reports on Long-Term
Science Projects

Do you ever assign (do) individual or group science projects or
investigations in school that take a week or more?

Percentage and Average Scale Score

                          Montana                   West                      Nation
                          Teacher      Student      Teacher      Student      Teacher      Student

Yes                       79 (4.1)     63 (2.3)     90 (3.4)     69 (2.5)     82 (2.6)     63 (2.8)
                          163 (1.4)    163 (1.2)    151 (2.6)    151 (2.2)    151 (1.3)    151 (1.3)

No                        21 (4.1)     37 (2.3)     10 (3.4)     31 (2.5)     18 (2.6)     37 (2.8)
                          160 (2.4)    161 (2.0)    141 (4.6)!   144 (3.3)    147 (3.4)    146 (1.7)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate
determination of the variability of this statistic.

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment

• In Montana, the eighth-grade students who reported designing and
carrying out their own scientific investigations once a week or more
frequently (10 percent) received an average scale score of 156.

• The average scale score for Montana students who reported designing 
and carrying out their own science investigations once a week or more 
(156) was lower than that for students who reported doing this once or 
twice a month (165). 

TABLE 5.5

Public School Students’ Reports on Independent Science
Investigations

When you study science in school, how often do you design and carry out
your own science investigations?

Percentage and Average Scale Score

                                   Montana        West           Nation

Never or hardly ever               68 (1.4)       62 (1.5)       63 (1.1)
                                   162 (1.4)      151 (2.4)      151 (1.0)

Once or twice a month              21 (1.2)       25 (1.2)       23 (0.8)
                                   165 (1.7)      152 (2.2)      151 (1.3)

Once or twice a week               7 (0.7)        9 (0.8)        10 (0.6)
                                   154 (3.1)      137 (2.9)      142 (2.3)

Almost every day                   3 (0.4)        4 (0.8)        5 (0.4)
                                   159 (4.1)      128 (4.1)      137 (2.5)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details).

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment

CHAPTER 6

Influences Beyond School that Facilitate
Learning Science

The home environment can be an important support for the school environment. 
To examine the relationship between science scale scores and home factors, data 
regarding students’ responses to questions about home factors and principals’ responses 
to questions about parental involvement in the school were examined. The student 
questionnaires also asked students how often they had changed schools because of 
household moves to examine the impact of student mobility on academic achievement.

Students’ attitudes toward science can influence their performance in the
assessment. For example, in a recent large-scale science assessment, students who
agreed that science learning is useful for the future and that science should be required
in school performed better than those who disagreed with these statements.® These
attitudes toward science may be attributed to factors within the school and external
influences. The beliefs and general impressions that secondary school students form
about science can affect not only their performance in assessments but also their
decisions about pursuing scientific careers in the future.**



® Campbell, J.R., C.M. Reese, C. O’Sullivan, and J.A. Dossey. NAEP 1994 Trends in Academic Progress. (Washington, 
DC: National Center for Education Statistics, 1996). 

54 

Gallagher, S.A. “Middle School Classroom Predictors of Science Persistence.” Journal of Research in Science 
Teaching, 1994, 33. pp. 721-734. 

Discussing Studies at Home

The importance of schoolwork for students and their families can be measured
by how often it is discussed at home. When students discuss academic work at home,
they create an important link between home and school. Recent NAEP assessments in
various subject areas have found a positive relationship between discussing studies at
home and student performance.^

The NAEP 1996 assessment asked students to report on how frequently they 
discuss schoolwork at home. As shown in Table 6.1, the results for eighth graders 
attending public schools in Montana indicate that: 

• Less than half of the eighth graders (43 percent) said they discussed 
their schoolwork at home almost every day. This percentage was greater 
than the percentage who said they never or hardly ever had such 
discussions (19 percent). 

• The average scale score for students who discussed their schoolwork 
almost every day (164) was higher than that for students who never or 
hardly ever did so (156). 

TABLE 6.1

Public School Students’ Reports on Discussing Studies at
Home

How often do you discuss things you have studied in school with
someone at home?

Percentage and Average Scale Score

                                   Montana        West           Nation

Never or hardly ever               19 (1.0)       19 (0.9)       21 (0.8)
                                   156 (2.3)      141 (2.6)      141 (1.5)

Once or twice a month              11 (0.8)       9 (1.0)        9 (0.4)
                                   165 (2.0)      147 (2.8)      149 (1.6)

Once or twice a week               28 (1.4)       29 (1.8)       28 (1.0)
                                   163 (1.6)      151 (2.2)      151 (1.3)

Almost every day                   43 (1.6)       43 (1.8)       41 (1.1)
                                   164 (1.5)      153 (2.9)      153 (1.2)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details).

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment



Campbell, J.R., P.L. Donahue, C.M. Reese, and G.W. Phillips. NAEP 1994 Reading Report Card for the Nation and 
the States. (Washington, DC: National Center for Education Statistics, 1996); Beatty, A.S., C.M. Reese, H.R. Persky, 
and P. Carr. NAEP 1994 U.S. History Report Card. (Washington, DC: National Center for Education Statistics, 1996); 
Persky, H.R., C.M. Reese, C.Y. O'Sullivan, S. Lazer, J. Moore, and S. Shakrani. NAEP 1994 Geography Report 
Card. (Washington, DC: National Center for Education Statistics, 1996). 

Literacy Materials in the Home

Students can learn much about science by reading materials outside the classroom. 
For example, scientific information can often be found in mainstream newspaper and 
magazine articles. Also, the availability of reading and reference materials at home may 
be an indicator of the value placed on learning by the parents.^ In recent NAEP 
assessments, a positive relationship has been reported between print materials in the 
home and average scale scores.^ 

The NAEP science assessment asked students whether their families had more than 
25 books, an encyclopedia, a newspaper, or any magazines in their home. Table 6.2 
shows the percentages of eighth-grade public school students reporting that their families 
have all four types, only three types, or two or fewer types of these literacy materials. 
The table also presents students’ corresponding average scale scores. Based on their 
responses: 

• About half of the students in Montana (53 percent) reported having all 
four types of literacy materials in their homes. This percentage was 
greater than the percentage for the nation (47 percent). 

• The percentage of students in Montana reporting having two or fewer
types of these materials (17 percent) was smaller than the percentage
having all four types (53 percent). The percentage having two or fewer
types was smaller than the percentage for the nation (24 percent).

• The average science scale score for students in Montana with all four 
types of literacy materials (167) was higher than that for students with 
two or fewer types (153). 






^ Rogoff, B. Apprenticeship in Thinking: Cognitive Development in Social Context. (New York: Oxford University Press, 
1990). 

^ Campbell, J.R., P.L. Donahue, C.M. Reese, and G.W. Phillips. NAEP 1994 Reading Report Card for the Nation and the 
States. (Washington, DC: National Center for Education Statistics, 1996); Beatty, A.S., C.M. Reese, H.R. Persky, and 
P. Carr. NAEP 1994 U.S. History Report Card. (Washington, DC: National Center for Education Statistics, 1996); 
Persky, H.R., C.M. Reese, C.Y. O’Sullivan, S. Lazer, J. Moore, and S. Shakrani. NAEP 1994 Geography Report 
Card. (Washington, DC: National Center for Education Statistics, 1996). 

TABLE 6.2

Public School Students’ Reports on Literacy Materials in the
Home

How many of the following types of reading materials are in your home
(more than 25 books, an encyclopedia, a newspaper, magazines)?

Percentage and Average Scale Score

                                   Montana        West           Nation

Zero to two                        17 (1.0)       26 (1.9)       24 (0.7)
                                   153 (2.0)      131 (2.3)      132 (1.2)

Three                              30 (1.1)       31 (1.2)       29 (0.8)
                                   159 (1.7)      149 (1.6)      149 (1.0)

Four                               53 (1.2)       43 (2.5)       47 (1.1)
                                   167 (1.3)      160 (2.2)      158 (1.2)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details).

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment.
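
The "± 2 standard errors" statement in the table note can be illustrated with a short calculation. This is a sketch only: the function name is hypothetical, and the multiplier of 2 is the approximate 95 percent value given in the note.

```python
def interval(estimate, standard_error, multiplier=2.0):
    """Approximate 95 percent interval: estimate plus or minus 2 standard errors."""
    half_width = multiplier * standard_error
    return (estimate - half_width, estimate + half_width)


# Montana students reporting all four types of literacy materials (Table 6.2).
low, high = interval(53, 1.2)
print(f"percentage: {low:.1f} to {high:.1f}")

low, high = interval(167, 1.3)
print(f"average scale score: {low:.1f} to {high:.1f}")
```

For the Montana row, the percentage interval runs from 50.6 to 55.4 and the scale score interval from 164.4 to 169.6.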



Television Viewing Habits 

Past NAEP assessments have shown that more than 40 percent of eighth-grade 
students reported watching four or more hours of television each day. A major concern 
is that watching television reduces the time spent on homework and related academic 
activities. Although the effects of such extensive television exposure are difficult to 
document, a generally negative relationship exists between NAEP score results and 
number of television hours watched.^ The recent TIMSS assessment shows a similar 
pattern for most countries. In general, beyond one to two hours of daily television 
viewing, the more that eighth graders reported watching, the lower their science 
achievement.'* 

Students were asked how much television (including videotapes) they usually 
watched each school day. The results for eighth-grade public school students in 
Montana are shown in Table 6.3 and indicate the following: 



3 

Campbell, J.R., P.L. Donahue, C.M. Reese, and G.W. Phillips. NAEP 1994 Reading Report Card for the Nation and the 
States. (Washington, DC: National Center for Education Statistics, 1996); Beatty, A.S., C.M. Reese, H.R. Persky, and 
P. Carr. NAEP 1994 U.S. History Report Card. (Washington, DC: National Center for Education Statistics, 1996); 
Persky, H.R., C.M. Reese, C.Y. O’Sullivan, S. Lazer, J. Moore, and S. Shakrani. NAEP 1994 Geography Report 
Card. (Washington, DC: National Center for Education Statistics, 1996); Campbell, J.R., C.M. Reese, C.Y. O'Sullivan, 
and J.A. Dossey. NAEP 1994 Trends in Academic Progress. (Washington, DC: National Center for Education Statistics, 
1996). 

4 

Beaton, A.E., M.O. Martin, I.V.S. Mullis, E.J. Gonzalez, T.A. Smith, and D.L. Kelly. Science Achievement in the Middle
School Years: IEA’s Third International Mathematics and Science Study (TIMSS). (Chestnut Hill, MA: TIMSS
International Study Center at Boston College, 1996).

• Among eighth graders, 9 percent reported watching six or more hours
of television on a typical day. This percentage was smaller than the
percentage who reported watching one hour or less (23 percent).

• The percentage of eighth graders in Montana who reported watching six 
or more hours of television a day (9 percent) was smaller than the 
percentage for the nation (17 percent). 

• The average science scale score for eighth-grade students who reported 
watching two to three hours of television a day (164) was not 
significantly different from that for students who reported watching one 
hour or less (167). 

• The average science scale score for eighth graders who reported 
watching two to three hours of television a day (164) was higher than 
that for students who reported watching six hours or more (148). 

TABLE 6.3

Public School Students’ Reports on Television Viewing Habits

On a school day, about how many hours do you usually watch TV or
videotapes outside of school hours?

Percentage and Average Scale Score

                                   Montana        West           Nation

1 hour or less                     23 (1.0)       21 (1.1)       19 (1.0)
                                   167 (1.8)      155 (3.1)      156 (2.0)

2 to 3 hours                       49 (1.3)       44 (1.5)       40 (1.3)
                                   164 (1.3)      152 (2.4)      154 (1.2)

4 to 5 hours                       19 (1.0)       21 (1.1)       24 (0.6)
                                   158 (1.9)      146 (2.2)      146 (1.0)

6 hours or more                    9 (0.7)        15 (1.3)       17 (0.7)
                                   148 (3.5)      134 (2.1)      130 (1.1)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details).

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment

Parental Support

When parents are involved in their children’s education, both children and parents 
are likely to benefit. Research on students at risk has shown that parents’ participation 
in their child’s education has more effect on the child’s performance than parents’ 
income or education.™ Parental involvement is naturally part of the home environment, 
but it is also increasingly sought in the school. 

As part of the NAEP assessment, the principals of participating students were 
asked about parental involvement in their schools. Table 6.4 presents the results for 
eighth graders in public schools in Montana. According to these results: 

• Overall, almost all of the eighth-grade students attended schools where 
principals characterized parental support as very positive (35 percent) 
or somewhat positive (58 percent). 

• The average scale score for eighth graders attending school where 
parental support was characterized as very positive (165) was higher 
than that for the 7 percent of students whose principals reported 
somewhat to very negative parental support (144). 

TABLE 6.4

Public Schools’ Reports on Parental Support

How would you characterize parental support for student
achievement within your school?

Percentage and Average Scale Score

                                   Montana        West           Nation

Somewhat to very negative          7 (3.0)        6 (3.3)        7 (2.6)
                                   144 (7.4)!     *** (**.*)     154 (2.1)!

Somewhat positive                  58 (4.9)       62 (10.3)      61 (5.6)
                                   162 (1.2)      147 (2.4)      148 (1.4)

Very positive                      35 (4.0)       32 (8.5)       31 (4.7)
                                   165 (1.4)      153 (7.1)!     151 (3.3)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details). ! Interpret with caution: the nature of the sample does not allow accurate
determination of the variability of this statistic. *** Sample size is insufficient to permit a reliable estimate.

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment



U.S. Department of Education. Mapping out the National Assessment of Title I: The Interim Report - 1996.
(Washington, DC: Office of Educational Research and Improvement, 1996).

Student Mobility

The United States has long been a nation “on the move.” Research indicates that 
moving more than once or twice during a school career lowers student performance. 
Students who attend the same school throughout their careers are most likely to graduate, 
while the most mobile of the school populations have the highest rates of failure and 
dropping out. The effects of high mobility are far-reaching; schools with high mobility 
rates depress performance even for students who do not move.” 

To examine the relationship between mobility and science performance, the NAEP 
assessment asked students how many times since starting first grade they had changed 
schools due to changes in where they lived. Table 6.5 shows results for eighth-grade 
public school students in Montana. 

• In terms of student mobility, 43 percent of eighth graders reported not
moving since starting first grade while 6 percent of students reported
moving six or more times. The students with the highest reported
mobility had an average scale score (155) that was lower than that of
students who reported not moving (166).

• The percentage of students in Montana who reported moving six or more 
times (6 percent) was not significantly different from the percentage for 
the nation (6 percent). 

TABLE 6.5

Public School Students’ Reports on Mobility

Since you started first grade, how many times have you changed
schools, not counting when you were promoted to the next grade?

Percentage and Average Scale Score

                                   Montana        West           Nation

None                               43 (1.5)       37 (1.3)       44 (1.2)
                                   166 (1.7)      152 (2.4)      153 (1.3)

One                                19 (1.0)       20 (0.8)       19 (0.8)
                                   163 (1.9)      154 (2.9)      154 (1.4)

Two                                10 (0.7)       10 (0.7)       10 (0.4)
                                   161 (2.3)      144 (3.0)      145 (1.4)

Three                              11 (0.7)       14 (1.2)       11 (0.6)
                                   156 (2.4)      145 (4.6)      141 (2.3)

Four or five                       11 (0.8)       12 (0.8)       10 (0.5)
                                   158 (2.3)      149 (2.9)      142 (1.7)

Six or more                        6 (0.7)        7 (0.6)        6 (0.3)
                                   155 (3.4)      140 (3.4)      141 (2.0)

The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details).

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science
Assessment



ERIC Clearinghouse on Urban Education. Highly Mobile Students: Educational Problems and Possible Solutions. (New
York: ERIC Clearinghouse on Urban Education, ERIC/CUE Digest, Number 73, 1991).
URL: http://www.ed.gov/databases/ERIC_Digests/ed338745.html. See also The Condition of Education
1995/indicator46 at URL: http://www.ed.gov/NCES/pubs/ce/c9546a01.html.

Students’ Views About Science

Science educators have been interested in the relationship between students’ 
attitudes and student performance for several decades. A considerable body of research 
has shown a correlation between students’ attitudes and their performance in science, 
with positive attitudes typically being associated with higher performance.” Therefore, 
the 1996 NAEP science assessment asked several questions to gauge students’ attitudes 
toward science. Table 6.6 shows the responses for eighth graders in Montana. 

• In Montana, 43 percent of eighth graders agreed that science is useful 
for solving everyday problems. The average scale score for these 
students (167) was higher than that for students who were unsure about 
this statement or who did not agree with it (159). 

• In Montana, 32 percent of the students agreed that learning science is 
mostly memorizing facts. The average scale score for eighth graders 
who felt that learning science is mostly memorizing (163) was not 
significantly different from the average scale score of students who were 
unsure or disagreed with this statement (162). 





TABLE 6.6 


Public School Students’ Views About Science 




How much do you agree with the 
following statements? 


Montana West Nation 


Percentage and Average Scale Score 



Science Is useful for solving 
everyday problems. 

Disagree 


21 (1.3) 


28 ( 1.6) 


25 ( 1.0) 


155 ( 1.8) 


140 ( 2.9) 


139(1.5) 


Not sure 


36 ( 0.8) 


35 ( 1.3) 


35(0.7) 




161 ( 1.9) 


152 ( 2.0) 


150 ( 0.9) 


Agree 


43(1.3) 


37 ( 1.4) 


40 ( 1.1) 




167 ( 1.3) 


153 ( 3.0) 


155 ( 1.1) 


Learning science is mostly 
memorizing. 

Disagree 


33(1.4) 


30 ( 0.8) 


30 ( 0.8) 




165 ( 1.4) 


150 ( 2.4) 


150(1.3) 


Not sure 


35 ( 1.2) 


39 ( 0.6) 


37 ( 0.5) 




159 ( 2.0) 


147 ( 2.5) 


148(1.1) 


Agree 


32 ( 1.2) 


31 ( 0.9) 


33 ( 0.9) 


163(1.4) 


150 ( 2.3) 


149 ( 1.1) 



The NAEP science scale ranges from 0 to 300. The standard errors of the statistics appear in parentheses. It can be said
with about 95 percent confidence that, for each population of interest, the value for the entire population is within ± 2
standard errors of the estimate for the sample. In comparing two estimates, one must use the standard error of the
difference (see Appendix A for details).

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
^ Weinburg, M. “Gender Differences in Student Attitudes Toward Science: A Meta Analysis of the Literature from 1970
to 1991.” Journal of Research in Science Teaching, 1995, 32, pp. 387-398.
APPENDIX A



Reporting NAEP 1996 Science Results 

A.1 Participation Guidelines 

As was discussed in the Introduction, unless the overall participation rate for a
jurisdiction is sufficiently high, the assessment results for that jurisdiction may be 
subject to appreciable nonresponse bias. Moreover, even if the overall participation rate 
is high, significant nonresponse bias may exist if the nonparticipation that does occur 
is heavily concentrated among certain types of schools or students. The following 
guidelines concerning school and student participation rates in the state assessment 
program were established to address four significant ways in which nonresponse bias 
could be introduced into the jurisdiction sample estimates.

The first three guidelines describe the determination of whether a jurisdiction is 
eligible to have its results published. Guidelines 4-11 describe conditions under which 
a jurisdiction’s published results will include a notation. Such a notation would indicate 
the possibility of bias in particular results, due to nonresponse from segments of the 
sample. Note that in order for a jurisdiction’s results to be published without notations, 
that jurisdiction must comply with all guidelines. (A thorough discussion of the NAEP 
participation guidelines can be found in the Technical Report of the NAEP 1996 State 
Assessment Program in Science.) 

Guidelines on the Publication of NAEP Results 



Guideline 1 — Publication of Public School Results 
A jurisdiction will have its public school results published in the NAEP 1996 
Science Report Card (or in other reports that include all state-level results) if 
and only if its weighted participation rate for the initial sample of public 
schools is greater than or equal to 70 percent. Similarly, a jurisdiction will 
receive a separate NAEP 1996 Science State Report if and only if its weighted 
participation rate for the initial sample of public schools is greater than or 
equal to 70 percent.

Guideline 2 — Publication of Nonpublic School Results
A jurisdiction will have its nonpublic school results published in the NAEP 
1996 Science Report Card (or in other reports that include all state-level 
results) if and only if its weighted participation rate for the initial sample of 
nonpublic schools is greater than or equal to 70 percent AND meets minimum 
sample size requirements.^ A jurisdiction eligible to receive a separate NAEP
1996 Science State Report under guideline 1 will have its nonpublic school 
results included in that report if and only if that jurisdiction’s weighted 
participation rate for the initial sample of nonpublic schools is greater than 
or equal to 70 percent AND meets minimum sample size requirements. If a 
jurisdiction meets guideline 2 but fails to meet guideline 1, a separate NAEP 
1996 Science State Report will be produced containing only nonpublic school 
results. 


Guideline 3 — Publication of Combined Public and 
Nonpublic School Results 

A jurisdiction will have its combined results published in the NAEP 1996 
Science Report Card (or in other reports that include all state-level results) if 
and only if both guidelines 1 and 2 are satisfied. Similarly, a jurisdiction 
eligible to receive a separate NAEP 1996 Science State Report under 
guideline 1 will have its combined results included in that report if and only 
if guideline 2 is also met. 

Guidelines for Notations of NAEP Results 



Guideline 4 — Notation for Overall Public School 
Participation Rate 

A jurisdiction that meets guideline 1 will receive a notation if its weighted 
participation rate for the initial sample of public schools was below 85 percent 
AND the weighted public school participation rate after substitution was below 
90 percent. 



Guideline 5 — Notation for Overall Nonpublic School 
Participation Rate

A jurisdiction that meets guideline 2 will receive a notation if its weighted 
participation rate for the initial sample of nonpublic schools was below 85 
percent AND the weighted nonpublic school participation rate after 
substitution was below 90 percent. 
^ Minimum participation size requirements for reporting nonpublic school data consist of two components: (1) a school
sample size of six or more participating schools and (2) an assessed student sample size of at least 62.

Guideline 6 — Notation for Strata-Specific Public School
Participation Rate 

A jurisdiction that is not already receiving a notation under guideline 4 will 
receive a notation if the sample of public schools included a class of schools 
with similar characteristics that had a weighted participation rate (after 
substitution) of below 80 percent, and from which the nonparticipating schools 
together accounted for more than five percent of the jurisdiction’s total 
weighted sample of public schools. The classes of schools from each of which 
a jurisdiction needed minimum school participation levels were determined
by degree of urbanization, minority enrollment, and median household income 
of the area in which the school is located. 

Guideline 7 — Notation for Strata-Specific Nonpublic School 
Participation Rate 

A jurisdiction that is not already receiving a notation under guideline 5 will
receive a notation if the sample of nonpublic schools included a class of 
schools with similar characteristics that had a weighted participation rate (after 
substitution) of below 80 percent, and from which the nonparticipating schools 
together accounted for more than five percent of the jurisdiction’s total 
weighted sample of nonpublic schools. The classes of schools from each of 
which a jurisdiction needed minimum school participation levels were
determined by type of nonpublic school (Catholic versus non-Catholic) and
location (metropolitan versus nonmetropolitan).

Guideline 8 — Notation for Overall Student Participation 
Rate in Public Schools 

A jurisdiction that meets guideline 1 will receive a notation if the weighted 
student response rate within participating public schools was below 85 percent. 

Guideline 9 — Notation for Overall Student Participation 
Rate in Nonpublic Schools 

A jurisdiction that meets guideline 2 will receive a notation if the weighted 
student response rate within participating nonpublic schools was below 
85 percent. 
Guideline 10 — Notation for Strata-Specific Student Participation
Rates in Public Schools 

A jurisdiction that is not already receiving a notation under guideline 8 will 
receive a notation if the sampled students within participating public schools 
included a class of students with similar characteristics that had a weighted 
student response rate of below 80 percent, and from which the nonresponding 
students together accounted for more than five percent of the jurisdiction’s 
weighted assessable public school student sample. Student groups from which 
a jurisdiction needed minimum levels of participation were determined by the 
age of the student, whether or not the student was classified as a student with 
a disability (SD) or of limited English proficiency (LEP), and the type of 
assessment session (monitored or unmonitored), as well as school level of 
urbanization, minority enrollment, and median household income of the area 
in which the school is located. 

Guideline 11 — Notation for Strata-Specific Student Participation 
Rates in Nonpublic Schools 

A jurisdiction that is not already receiving a notation under guideline 9 will 
receive a notation if the sampled students within participating nonpublic 
schools included a class of students with similar characteristics that had a 
weighted student response rate of below 80 percent, and from which the 
nonresponding students together accounted for more than five percent of the 
jurisdiction’s weighted assessable nonpublic school student sample. Student 
groups from which a jurisdiction needed minimum levels of participation were 
determined by the age of the student, whether or not the student was classified 
as a student with a disability (SD) or of limited English proficiency (LEP), 
and the type of assessment session (monitored or unmonitored), as well as type 
and location of school. 
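
The overall (non-strata) rules above lend themselves to a short sketch. The Python below is only an illustration of guidelines 1, 2, 4, 5, 8, and 9 as worded in this appendix; the class and function names are hypothetical, the rates are assumed to be weighted percentages that have already been computed, and the strata-specific guidelines (6, 7, 10, and 11) are not covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SchoolSample:
    """Weighted participation figures (in percent) for one school sector.

    Hypothetical container; the field names are assumptions for illustration.
    """
    initial_rate: float             # school participation, initial sample
    rate_after_substitution: float  # school participation after substitution
    student_response_rate: float    # student response within participating schools


def meets_publication_rate(sample: SchoolSample) -> bool:
    """Rate test of guidelines 1 and 2: initial participation of at least 70 percent."""
    return sample.initial_rate >= 70.0


def meets_nonpublic_minimums(participating_schools: int, assessed_students: int) -> bool:
    """Minimum sample sizes for nonpublic results: 6 schools and 62 students."""
    return participating_schools >= 6 and assessed_students >= 62


def overall_notations(sample: SchoolSample) -> List[str]:
    """Notations triggered by the overall rates for a sector that is published."""
    notes = []
    # Guidelines 4 and 5: both conditions must hold for a notation.
    if sample.initial_rate < 85.0 and sample.rate_after_substitution < 90.0:
        notes.append("school participation")
    # Guidelines 8 and 9: student response rate below 85 percent.
    if sample.student_response_rate < 85.0:
        notes.append("student participation")
    return notes
```

For example, a sector with an initial rate of 72 percent, a rate after substitution of 88 percent, and a student response rate of 91 percent would be published with a school participation notation only.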



A.2 NAEP Reporting Groups

The NAEP state assessment program provides results for groups of students 
defined by shared characteristics — region of the country, gender, race/ethnicity, parental 
education, type of school, and participation in federally funded Title I programs and the 
free/reduced-price lunch component of the National School Lunch Program. Based on 
criteria described later in this appendix, results are reported for subpopulations only 
when sufficient numbers of students and adequate school representation are present. 

For public school students, there must be at least 62 students in a particular subgroup 
from at least 5 primary sampling units (PSUs).^ For nonpublic school students, the 
minimum requirement is 62 students in a particular subgroup from at least 6 different
schools. However, the data for all students, regardless of whether their subgroup was 
reported separately, were included in computing overall results for Montana. Definitions 
of the subpopulations referred to in this report are presented on the following pages. 

Region 

Results are reported for four regions of the nation: Northeast, Southeast, Central, 
and West. The states included in each region are shown in Figure A.1. All 50 states
and the District of Columbia are listed. Territories and the two Department of Defense 
Education Activity jurisdictions were not assigned to a region. 

Regional results are based on national assessment samples, not on aggregated state 
assessment program samples. Thus, the regional results are based on a different and 
separate sample from that used to report the state results. 



FIGURE A.1
Regions of the Country

NORTHEAST               SOUTHEAST          CENTRAL          WEST

Connecticut             Alabama            Illinois         Alaska
Delaware                Arkansas           Indiana          Arizona
District of Columbia    Florida            Iowa             California
Maine                   Georgia            Kansas           Colorado
Maryland                Kentucky           Michigan         Hawaii
Massachusetts           Louisiana          Minnesota        Idaho
New Hampshire           Mississippi        Missouri         Montana
New Jersey              North Carolina     Nebraska         Nevada
New York                South Carolina     North Dakota     New Mexico
Pennsylvania            Tennessee          Ohio             Oklahoma
Rhode Island            Virginia*          South Dakota     Oregon
Vermont                 West Virginia      Wisconsin        Texas
Virginia*                                                   Utah
                                                            Washington
                                                            Wyoming

Note: The part of Virginia that is included in the Washington, DC, metropolitan area is included in the Northeast region; 
the remainder of the state is in the Southeast region. 



^ For the State Assessment Program, a PSU is most often a single school; for the national assessment, a PSU is a selected 
geographic region (a county, group of counties, or metropolitan statistical area). 




Gender 

Results are reported separately for males and females. 

Race/Ethnicity

The racial/ethnic results presented in this report attempt to provide a clear picture 
based on several sources. The race/ethnicity variable is an imputed definition of 
race/ethnicity derived from up to three sources of information. This variable is used for 
race/ethnicity subgroup comparisons. Two questions from the student demographics 
questionnaire were used in the determination of derived race/ethnicity: 



If you are Hispanic, what is your Hispanic background?
° I am not Hispanic.
° Mexican, Mexican American, or Chicano
° Puerto Rican
° Cuban
° Other Spanish or Hispanic background



Students who responded to this question by filling in the second, third, fourth, or 
fifth oval were considered Hispanic. For students who filled in the first oval, did not 
respond to the question, or provided information that was illegible or could not be 
classified, responses to the question below were examined in an effort to determine 
race/ethnicity. 



Which best describes you?
° White (not Hispanic)
° Black (not Hispanic)
° Hispanic (“Hispanic” means someone who is from a Mexican,
  Mexican American, Chicano, Puerto Rican, Cuban,
  or other Spanish or Hispanic background.)
° Asian or Pacific Islander (“Asian or Pacific Islander”
  means someone who is from a Chinese, Japanese, Korean,
  Filipino, Vietnamese, or other Asian or Pacific Island background.)
° American Indian or Alaskan Native (“American Indian or
  Alaskan Native” means someone who is from one of the American
  Indian tribes, or one of the original people of Alaska.)
° Other (specify)
Students’ race/ethnicity was then assigned on the basis of their response. For
students who filled in the sixth oval (“Other”) or provided illegible information or 
information that could not be classified, or did not respond at all, race/ethnicity was 
assigned as determined by school records.^ 

Derived race/ethnicity could not be determined for students who did not respond 
to either of the demographic questions and for whom a race/ethnicity designation was 
not provided by the school. 
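
The three-source rule described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the program used by NAEP: the function name, the use of oval numbers (1 through 6, with `None` standing for a missing, illegible, or unclassifiable response), and the category labels are all assumptions made for the example.

```python
from typing import Optional

# Ovals 2-5 of the Hispanic-background question indicate a Hispanic background.
HISPANIC_BACKGROUND_OVALS = {2, 3, 4, 5}

# Ovals 1-5 of "Which best describes you?"; oval 6 ("Other") is deliberately
# absent so that it falls through to school records.
SELF_DESCRIPTION = {
    1: "White",
    2: "Black",
    3: "Hispanic",
    4: "Asian or Pacific Islander",
    5: "American Indian or Alaskan Native",
}


def derive_race_ethnicity(
    hispanic_oval: Optional[int],
    describes_oval: Optional[int],
    school_record: Optional[str],
) -> Optional[str]:
    """Return the derived category, or None when it cannot be determined."""
    # Source 1: any Hispanic background takes precedence.
    if hispanic_oval in HISPANIC_BACKGROUND_OVALS:
        return "Hispanic"
    # Source 2: the student's self-description, if it is one of ovals 1-5.
    if describes_oval in SELF_DESCRIPTION:
        return SELF_DESCRIPTION[describes_oval]
    # Source 3: the school's designation (may itself be missing).
    return school_record
```

Under this reading, a student who marks "Puerto Rican" on the first question and "White (not Hispanic)" on the second is classified as Hispanic, because the first question is examined first.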

The details of how race/ethnicity classifications are derived are presented so that
the readers can determine the usefulness of the results for their particular uses. It should 
be noted that a nonnegligible number of students indicated a Hispanic background (e.g., 
Puerto Rican or Cuban) and indicated that a racial/ethnic category other than Hispanic 
best described them. These students were classified as Hispanic according to the rules 
described above. Also, information from the schools did not always correspond to 
students’ descriptions of themselves. 

Parents’ Highest Level of Education 

The variable representing level of parental education is derived from responses to 
two questions from the set of general background questions. Students were asked to 
indicate the extent of their mothers’ education: 



How far in school did your mother go?
° She did not finish high school.
° She graduated from high school.
° She had some education after high school.
° She graduated from college.
° I don’t know.



Students were asked a similar question about their fathers’ education level:



How far in school did your father go?
° He did not finish high school.
° He graduated from high school.
° He had some education after high school.
° He graduated from college.
° I don’t know.



3 The procedure for assigning race/ethnicity was modified for Hawaii. See the Technical Report for the NAEP 1996 State
Assessment Program in Science for details.
This information was combined into one parental education reporting variable
through the following procedure. If a student indicated the extent of education for only 
one parent, that level was included in the data. If a student indicated the extent of 
education for both parents, the higher of the two levels was included in the data. For 
students who did not know the level of education for both parents or did not know the 
level for one parent and did not respond for the other, the parental education level was 
classified as “I don’t know.” If the student did not respond for either parent, the student 
was recorded as having provided no response. 
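
A minimal sketch of this combining rule follows, assuming each response is stored as one of the labels below and that `None` stands for a question the student left unanswered. The function name and labels are hypothetical and serve only to illustrate the procedure.

```python
from typing import Optional

# The four reported levels, ordered from lowest to highest.
LEVELS = [
    "did not finish high school",
    "graduated from high school",
    "some education after high school",
    "graduated from college",
]
DONT_KNOW = "I don't know"
NO_RESPONSE = "no response"


def parental_education(mother: Optional[str], father: Optional[str]) -> str:
    """Combine the two parent responses into one reporting category."""
    # Keep only responses that name an actual education level.
    known = [r for r in (mother, father) if r in LEVELS]
    if known:
        # One level reported: use it. Two levels reported: use the higher.
        return max(known, key=LEVELS.index)
    # No level reported: "I don't know" if the student said so for either parent.
    if DONT_KNOW in (mother, father):
        return DONT_KNOW
    # Neither question was answered.
    return NO_RESPONSE
```

For instance, a mother who graduated from high school and a father who graduated from college yield "graduated from college", while "I don't know" for one parent and no answer for the other yields "I don't know".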

It should be noted that, nationally, approximately one-tenth of eighth graders 
reported not knowing the education level of either of their parents. 

Type of School 

Samples for the 1996 state assessment program were expanded to include students 
attending nonpublic schools (Catholic schools and other religious and private schools) 
in addition to students attending public schools. The expanded coverage was instituted 
for the first time in 1994. Samples for the 1990 and 1992 Trial State Assessment 
programs had been restricted to public school students only. For those jurisdictions 
meeting pre-established participation rate standards (see earlier section of this appendix), 
separate results are reported for public schools, for nonpublic schools, and for the 
combined public and nonpublic school samples. The combined sample for each 
jurisdiction also contains students attending Bureau of Indian Affairs (BIA) schools and 
Department of Defense Domestic Dependent Elementary and Secondary Schools 
(DDESS) in that jurisdiction. These two categories of schools are not included in either 
the public or nonpublic school samples. 

Note that eighth graders in the DDESS and Department of Defense Dependents 
Schools (DoDDS)^ were assessed in 1996 as separate jurisdictions and reported as 
jurisdictions with public school samples only. 
^ The Department of Defense Dependents Schools (DoDDS) refers to overseas schools (i.e., schools outside the United
States). Department of Defense Domestic Dependent Elementary and Secondary Schools (DDESS) refers to domestic
schools (i.e., schools in the United States). DoDDS and DDESS fourth grades were also assessed in science, for a
special report.

Title I Participation

On the basis of available school records, students were classified either as 
currently participating in a Title I program or receiving Title I services, or as not 
receiving such services. The classification only refers to the school year when the 
assessment was administered (i.e., the 1995-96 school year) and is not based on
participation in previous years. If the school did not offer any Title I programs or 
services, all students in that school were classified as not participating. 

Free/Reduced-Price School Lunch Program Eligibility 

On the basis of available school records, students were classified either as 
currently eligible for the Department of Agriculture’s free/reduced-price lunch program 
or not. The classification refers only to the school year when the assessment was 
administered (i.e., the 1995-96 school year) and is not based on eligibility in previous
years. If the school did not participate in the program or if school records were not 
available, all students in that school were classified as “Information not available.” 

A.3 Guidelines for Analysis and Reporting 

This report describes science performance for eighth graders and compares the 
results for various groups of students within this population — for example, those who 
have certain demographic characteristics or who responded to a specific background
question in a particular way. The report examines the results for individual demographic 
groups and individual background questions. It does not include an analysis of the 
relationships among combinations of these subpopulations or background questions. 

Drawing Inferences from the Results 

Because the percentages of students in these subpopulations and their average 
scale scores are based on samples — rather than on the entire population of eighth 
graders in a jurisdiction — the numbers reported are necessarily estimates. As such, they 
are subject to a measure of uncertainty, reflected in the standard error of the estimate. 
When the percentages or average scale scores of certain groups are compared, it is 
essential to take the standard error into account, rather than to rely solely on observed 
similarities or differences. Therefore, the comparisons discussed in this report are based
on statistical tests that consider both the magnitude of the difference between the 
averages or percentages and the standard errors of those statistics. 
One of the goals of the science state assessment program is to estimate scale score
distributions and percentages of students in the categories described in A.2 for the 
overall populations of eighth-grade students in each participating jurisdiction based on 
the particular samples of students assessed. The use of confidence intervals, based on 
the standard errors, provides a way to make inferences about the population average 
scale scores and percentages in a manner that reflects the uncertainty associated with the 
sample estimates. An estimated sample average scale score ± 2 standard errors 
approximates a 95 percent confidence interval for the corresponding population average 
or percentage. This means that one can conclude with approximately 95 percent 
confidence that the average scale score of the entire population of interest (e.g., all 
eighth-grade students in public schools in a jurisdiction) is within ± 2 standard errors 
of the sample average. 

As an example, suppose that the average science scale score of the students in a 
particular jurisdiction’s eighth-grade sample were 156 with a standard error of 1.2. A 
95 percent confidence interval for the population average would be as follows: 

Average ± 2 standard errors = 156 ± 2 × (1.2) = 156 ± 2.4 =

(156 - 2.4, 156 + 2.4) = (153.6, 158.4)

Thus, one can conclude with 95 percent confidence that the average scale score for the 
entire population of eighth-grade students in public schools in that jurisdiction is 
between 153.6 and 158.4. 
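
The same arithmetic can be written as a small helper. This is a sketch only: the function name is hypothetical, and the multiplier of 2 is the approximation used in this appendix rather than an exact critical value.

```python
from typing import Tuple


def confidence_interval(
    estimate: float, standard_error: float, multiplier: float = 2.0
) -> Tuple[float, float]:
    """Approximate 95 percent confidence interval: estimate +/- 2 standard errors."""
    half_width = multiplier * standard_error
    return (estimate - half_width, estimate + half_width)


# The worked example above: average 156, standard error 1.2.
low, high = confidence_interval(156, 1.2)
print(f"{low:.1f} to {high:.1f}")  # prints "153.6 to 158.4"
```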

Similar confidence intervals can be constructed for percentages, if the percentages 
are not extremely large or extremely small. For extreme percentages, confidence 
intervals constructed in the above manner may not be appropriate, and accurate 
confidence intervals can be constructed only by using procedures that are quite 
complicated. 

Extreme percentages, defined by both the magnitude of the percentage and the size 
of the sample from which it was derived, should be interpreted with caution. (The
forthcoming Technical Report of the NAEP 1996 State Assessment Program in Science 
contains a more complete discussion of extreme percentages.) 
Analyzing Subgroup Differences in Averages and Percentages

The statistical tests determine whether the evidence, based on the data from the 
groups in the sample, is strong enough to conclude that the averages or percentages are 
actually different for those groups in the population. If the evidence is strong (i.e., the 
difference is statistically significant), the report describes the group averages or 
percentages as being different (e.g., one group performed higher than or lower than 
another group), regardless of whether the sample averages or sample percentages appear 
to be about the same or not. If the evidence is not sufficiently strong (i.e., the difference 
is not statistically significant), the averages or percentages are described as being not 
significantly different — again, regardless of whether the sample averages or sample 
percentages appear to be about the same or widely discrepant. The reader is cautioned 
to rely on the results of the statistical tests rather than on the apparent magnitude of the 
difference between sample averages or percentages when determining whether those 
sample differences are likely to represent actual differences between the groups in the 
population. 

In addition to the overall results, this report presents outcomes separately for a 
variety of important subgroups. Many of these subgroups are defined by shared 
characteristics of students, such as their gender or race/ethnicity. Other subgroups are 
defined by the responses of the assessed students’ science teachers to questions in the 
science teacher questionnaire. 

In Chapter 1 of this report, differences between the jurisdiction and the nation were 
tested for overall science scale score and for each of the fields of science. In Chapter 
2, significance tests were conducted for the overall scale score for each of the 
subpopulations. In Chapters 3 through 6, comparisons were made across subgroups for 
responses to various background questions. 

As an example of comparisons across subgroups, consider the question: Do 
students who reported discussing studies at home almost every day exhibit higher average 
science scale scores than students who report never or hardly ever doing so? 

To answer the question posed above, begin by comparing the average science scale 
score for the two groups being analyzed. If the average for the group that reported 
discussing their studies at home almost every day is higher, it may be tempting to 
conclude that that group does have a higher science scale score than the group that 
reported never or hardly ever discussing their studies at home. However, even though 
the averages differ, there may be no real difference in performance between the two 
groups in the population because of the uncertainty associated with the estimated average 
scale scores of the groups in the sample. Remember that the intent is to make a 
statement about the entire population, not about the particular sample that was assessed. 
The data from the sample are used to make inferences about the population as a whole. 
As discussed in the previous section, each estimated sample average scale score
(or percentage) has a degree of uncertainty associated with it. It is therefore possible 
that if all students in the population (rather than a sample of students) had been assessed 
or if the assessment had been repeated with a different sample of students or a different, 
but equivalent, set of questions, the performances of various groups would have been 
different. Thus, to determine whether there is a real difference between the average 
scale score (or percentage of students with a certain attribute) for two groups in the 
population, an estimate of the degree of uncertainty associated with the difference 
between the scale score averages or percentages of those groups must be obtained for 
the sample. This estimate of the degree of uncertainty — called the standard error of 
the difference between the groups — is obtained by taking the square of each group’s 
standard error, summing these squared standard errors, and then taking the square root 
of this sum. 

In a manner similar to that in which the standard error for an individual group 
average or percentage is used, the standard error of the difference can be used to help 
determine whether differences between groups in the population are real. The difference 
between the mean scale score or percentage of the two groups ± 2 standard errors of
the difference represents an approximate 95 percent confidence interval. If the
resulting interval includes zero, there is insufficient evidence to claim a real difference 
between groups in the population. If the interval does not contain zero, the difference 
between groups is statistically significant (different) at the 0.05 level. 

As another example, to determine whether the average science scale score of 
eighth-grade males is higher than that of eighth-grade females in a particular 
jurisdiction’s public schools, suppose that the sample estimates of the average scale 
scores and standard errors for males and females were as follows: 



Group      Average Scale Score    Standard Error

Males      148                    0.9
Females    146                    1.1



The difference between the estimates of the average scale scores of males and females 
is two points (148 - 146). The standard error of this difference is

$\sqrt{0.9^2 + 1.1^2} \approx 1.4$

Thus, an approximate 95 percent confidence interval for this difference is

Mean difference ± 2 standard errors of the difference = 2 ± 2 × (1.4) = 2 ± 2.8 = (-0.8, 4.8)

The value zero is within this confidence interval, which extends from -0.8 to 4.8
(i.e., zero is between -0.8 and 4.8). Thus, there is insufficient evidence to claim a 
difference in average science scale score between the populations of eighth-grade males 
and females in public schools in the hypothetical jurisdiction. 
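
The comparison above can be reproduced with a short Python sketch. The function names are hypothetical, the multiplier of 2 is the approximation used in this appendix, and the formula assumes the two groups are independent samples. Because the code does not round the standard error to 1.4 before doubling it, the interval it produces is about (-0.84, 4.84), which matches the published (-0.8, 4.8) after rounding.

```python
import math
from typing import Tuple


def se_of_difference(se_a: float, se_b: float) -> float:
    """Square root of the sum of the squared standard errors (independent groups)."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def difference_interval(
    mean_a: float, se_a: float, mean_b: float, se_b: float, multiplier: float = 2.0
) -> Tuple[float, float]:
    """Approximate 95 percent confidence interval for mean_a - mean_b."""
    diff = mean_a - mean_b
    half_width = multiplier * se_of_difference(se_a, se_b)
    return (diff - half_width, diff + half_width)


def is_significant(mean_a: float, se_a: float, mean_b: float, se_b: float) -> bool:
    """True when the interval for the difference excludes zero."""
    low, high = difference_interval(mean_a, se_a, mean_b, se_b)
    return not (low <= 0.0 <= high)


# Hypothetical jurisdiction from the example: males 148 (0.9), females 146 (1.1).
low, high = difference_interval(148, 0.9, 146, 1.1)
print(f"({low:.1f}, {high:.1f})", is_significant(148, 0.9, 146, 1.1))
```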

Throughout this report, when the average scale scores or percentages for two 
groups were compared, procedures like the one described above were used to draw the 
conclusions that are presented in the text.’ If a statement appears in the report indicating 
that a particular group had a higher (or lower) average scale score than a second group, 
the 95 percent confidence interval for the difference between groups did not contain 
zero. An attempt was made to distinguish between group differences that were 
statistically significant but rather small in a practical sense and differences that were both 
statistically and practically significant. A procedure based on effect sizes was used. 
Statistically significant differences that are rather small are described in the text as 
somewhat higher or somewhat lower. When a statement indicates that the average scale 
score or percentage of some attribute was not significantly different for two groups, the 
confidence interval included zero, and thus no difference could be assumed between the
groups. The reader is cautioned to avoid drawing conclusions solely on the basis of the
magnitude of the difference. A difference between two groups in the sample that
appears to be slight may represent a statistically significant difference in the population 
because of the magnitude of the standard errors. Conversely, a difference that appears 
to be large may not be statistically significant. 

The procedures described in this section, and the certainty ascribed to intervals 
(e.g., a 95 percent confidence interval), are based on statistical theory that assumes that 
only one confidence interval or test of statistical significance is being performed. 
However, in each chapter of this report, many different groups are being compared (i.e., 
multiple sets of confidence intervals are being calculated). In sets of confidence 
intervals, statistical theory indicates that the certainty associated with the entire set of 
intervals is less than that attributable to each individual comparison from the set if 
considered individually. To hold the certainty level for the set of comparisons at a 
particular level (e.g., 0.95), modifications (called multiple comparison procedures) must 
be made to the methods described in the previous section. One such procedure — the
Bonferroni method — was used in the analyses described in this report to form 
confidence intervals for the differences between groups whenever sets of comparisons 
were considered.® Using this method, the confidence intervals in the text that are based 
on sets of comparisons are more conservative than those described on the previous 
pages. In other words, some comparisons that were individually statistically significant 
us ing the methods previously described may not be statistically significant when the 
Bonferroni method was used to take the number of related comparisons into account. 
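
The effect of the Bonferroni adjustment on a single comparison can be sketched as follows. This is an illustration only, not the report's analysis code: the function names and the subgroup averages and standard errors are hypothetical, a normal approximation is assumed, and the standard error of the difference is formed as for independent samples.

```python
from math import sqrt
from statistics import NormalDist


def difference_interval(mean_a, se_a, mean_b, se_b, n_comparisons=1, level=0.95):
    """Confidence interval for mean_a - mean_b.

    With n_comparisons > 1 the error rate is split evenly across the
    family of comparisons (Bonferroni adjustment).
    """
    # Standard error of the difference, assuming independent samples.
    se_diff = sqrt(se_a ** 2 + se_b ** 2)
    # Per-comparison error rate after the Bonferroni split.
    alpha = (1 - level) / n_comparisons
    z = NormalDist().inv_cdf(1 - alpha / 2)
    diff = mean_a - mean_b
    return diff - z * se_diff, diff + z * se_diff


def is_significant(interval):
    """A difference is declared significant when the interval excludes zero."""
    low, high = interval
    return not (low <= 0 <= high)


# Hypothetical subgroup averages and standard errors (not from the report).
single = difference_interval(156.0, 1.5, 151.0, 1.6)
family = difference_interval(156.0, 1.5, 151.0, 1.6, n_comparisons=6)

print("single comparison:", single, is_significant(single))
print("family of six:    ", family, is_significant(family))
```

With these made-up inputs the same 5-point difference is significant when tested alone but not when treated as one of six related comparisons, which is the conservatism described above.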



⁵ The procedure described above (especially the estimation of the standard error of the difference) is, in a strict sense, 
appropriate only when the statistics being compared come from independent samples. For certain comparisons in the 
report, the groups were not independent. In those cases, a different (and more appropriate) estimate of the standard 
error of the difference was used. 

⁶ Miller, R.G. Simultaneous Statistical Inference. (New York: McGraw-Hill, 1966). 
Most of the multiple comparisons in this report pertain to relatively small sets or 
“families” of comparisons. For example, when comparisons were discussed concerning 
students’ reports of parental education, six comparisons were conducted — all pairs of 
the four parental education levels. In these situations, Bonferroni procedures were 
appropriate. However, the maps in Chapter 1 of this report display comparisons between 
Montana and all other participating jurisdictions. The “family” of comparisons in this 
case was as many as 46. To control the certainty level for a large family of comparisons, 
the false discovery rate (FDR) criterion⁷ was used. Unlike the Bonferroni procedures, 
which control the familywise error rate (i.e., the probability of making even one false 
rejection in the set of comparisons), the Benjamini and Hochberg (BH) approach using 
the FDR criterion controls the expected proportion of falsely rejected hypotheses as a 
proportion of all rejected hypotheses. Bonferroni procedures may be considered 
conservative for large families of comparisons.⁸ In other words, using the Bonferroni 
method would produce more statistically nonsignificant comparisons than using the BH 
approach. Therefore, the BH approach is potentially more powerful for comparing 
Montana to all other participating jurisdictions. A more detailed description of the 
Bonferroni and BH procedures appears in the Technical Report of the NAEP 1996 State 
Assessment Program in Science. 
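
The contrast between the two criteria can be sketched with the standard step-up form of the BH procedure. The p-values below are hypothetical and the function names are illustrative; the exact implementation used for the report is described only in the Technical Report.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Step-up BH procedure: find the largest rank k with
    p_(k) <= (k / m) * alpha and reject the k smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Hypothetical p-values for a family of eight comparisons.
p_values = [0.001, 0.008, 0.012, 0.020, 0.030, 0.200, 0.450, 0.700]

bonferroni = bonferroni_reject(p_values)
bh = benjamini_hochberg_reject(p_values)
print("Bonferroni rejections:", sum(bonferroni))
print("BH rejections:        ", sum(bh))
```

For this made-up family the Bonferroni rule flags one comparison and the BH rule flags five, illustrating why the BH approach is potentially more powerful when the family is large.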

Statistics with Poorly Estimated Standard Errors 

Not only are the averages and percentages reported in NAEP subject to 
uncertainty, but their standard errors are as well. In certain cases, typically when the 
standard error is based on a small number of students or when the group of students is 
enrolled in a small number of schools, the amount of uncertainty associated with the 
standard errors may be quite large. Throughout this report, estimates of standard errors 
subject to a large degree of uncertainty are followed by the symbol “!”. In such cases, 
the standard errors — and any confidence intervals or significance tests involving these 
standard errors — should be interpreted cautiously. Further details concerning 
procedures for identifying such standard errors are discussed in the Technical Report of 
the NAEP 1996 State Assessment Program in Science. 






⁷ Benjamini, Y. and Y. Hochberg. “Controlling the false discovery rate: A practical and powerful approach to multiple 
testing.” Journal of the Royal Statistical Society, Series B, 57(1). (pp. 289-300, 1994). 

⁸ Williams, V.S.L., L.V. Jones, and J.W. Tukey. Controlling Error in Multiple Comparisons, with Special Attention to the 
National Assessment of Educational Progress, (Research Triangle Park, NC: National Institute of Statistical Sciences, 
December 1994). 
Minimum Subgroup Sample Sizes 

Results for science performance and background variables were tabulated and 
reported for groups defined by gender, race/ethnicity, parental education, type of school, 
and participation in federally funded Title I programs and the free/reduced-price school 
lunch component of the National School Lunch Program. NAEP collects data for five 
racial/ethnic subgroups (White, Black, Hispanic, Asian/Pacific Islander, and American 
Indian/Alaskan Native) and four levels of parents’ education (Graduated From College, 
Some Education After High School, Graduated From High School, and Did Not Finish 
High School) plus the category “I Don’t Know.” 

In many jurisdictions, and for some regions of the country, the number of students 
in some of these groups was not sufficiently high to permit accurate estimation of 
performance and/or background variable results. As a result, data are not provided for 
the subgroups with students from very few schools or for the subgroups with very small 
sample sizes. For results to be reported for any state assessment subgroup, public school 
results must represent at least 5 primary sampling units (PSUs) and nonpublic school 
results must represent at least 6 schools. For results to be reported for any national 
assessment subgroup, at least 5 PSUs must be represented in the subgroup. In addition, 
a minimum sample of 62 students per subgroup is required. For statistical tests 
pertaining to subgroups, the sample size for both groups has to meet the minimum 
sample size requirements. 

The minimum sample size of 62 was determined by computing the sample size 
required to detect an effect size of 0.5 total-group standard deviation units with a 
probability of 0.8 or greater. The effect size of 0.5 pertains to the true difference 
between the average scale score of the subgroup in question and the average scale score 
for the total eighth-grade public school population in the jurisdiction, divided by the 
standard deviation of the scale score in the total population. If the true difference 
between subgroup and total group mean is 0.5 total-group standard deviation units, then 
a sample size of at least 62 is required to detect such a difference with a probability 
of 0.8. Further details about the procedure for determining minimum sample size appear 
in the Technical Report of the NAEP 1996 State Assessment Program in Science. 
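
A rough power calculation along these lines is sketched below. It is an approximation under stated assumptions, not the procedure from the Technical Report: it assumes a two-sided test at the 0.05 level, a normal approximation, and a design effect of 2 for the clustered sample. None of these three values is given in this section, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(n_students, effect_size=0.5, design_effect=2.0, alpha=0.05):
    """Approximate power to detect a standardized mean difference.

    design_effect and alpha are assumptions made for this sketch;
    they are not stated in this section of the report.
    """
    norm = NormalDist()
    # Clustering reduces the information in the sample.
    effective_n = n_students / design_effect
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Upper-tail rejection probability; the lower tail is negligible here.
    return 1 - norm.cdf(z_crit - effect_size * sqrt(effective_n))


power_62 = approximate_power(62)
print(f"approximate power with 62 students: {power_62:.3f}")
```

Under these assumptions a sample of 62 students gives a power close to 0.8; a different design effect or test would change the figure.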
Describing the Size of Percentages 

Some of the percentages reported in the text of the report are given qualitative 
descriptions. For example, the number of students currently taking a biology class might 
be described as “relatively few” or “almost all,” depending on the size of the percentage 
in question. Any convention for choosing descriptive terms for the magnitude of 
percentages is to some degree arbitrary. The descriptive phrases used in the report and 
the rules used to select them are shown below. 



Percentage        Descriptive Term Used in Report 

p = 0             None 
0 < p ≤ 8         A small percentage 
8 < p ≤ 13        Relatively few 
13 < p ≤ 18       Less than one fifth 
18 < p ≤ 22       About one fifth 
22 < p ≤ 27       About one quarter 
27 < p ≤ 30       Less than one third 
30 < p ≤ 36       About one third 
36 < p ≤ 47       Less than half 
47 < p ≤ 53       About half 
53 < p ≤ 64       More than half 
64 < p ≤ 71       About two thirds 
71 < p ≤ 79       About three quarters 
79 < p ≤ 89       A large majority 
89 < p < 100      Almost all 
p = 100           All 
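
The convention in the table can be expressed as a small lookup. This is a sketch with a hypothetical function name; it assumes that the upper bound of each range is inclusive, which the extracted table does not show unambiguously.

```python
# (upper bound, term): a percentage maps to the first row whose bound it
# does not exceed. Inclusive upper bounds are an assumption of this sketch.
_TERMS = [
    (8, "A small percentage"),
    (13, "Relatively few"),
    (18, "Less than one fifth"),
    (22, "About one fifth"),
    (27, "About one quarter"),
    (30, "Less than one third"),
    (36, "About one third"),
    (47, "Less than half"),
    (53, "About half"),
    (64, "More than half"),
    (71, "About two thirds"),
    (79, "About three quarters"),
    (89, "A large majority"),
]


def describe_percentage(p):
    """Return the descriptive term for a percentage between 0 and 100."""
    if not 0 <= p <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if p == 0:
        return "None"
    if p == 100:
        return "All"
    for upper, term in _TERMS:
        if p <= upper:
            return term
    return "Almost all"


for value in (0, 5, 20, 50, 95, 100):
    print(value, "->", describe_percentage(value))
```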
The NAEP 1996 Science Assessment 

The science framework for the 1996 National Assessment of Educational 
Progress was produced under the auspices of the National Assessment Governing Board 
through a consensus process. The consensus process, managed by the Council of Chief 
State School Officers, with the National Center for Improving Science Education and 
the American Institutes for Research, developed the framework over a ten-month period 
between October 1990 and August 1991. The following factors guided the process for 
developing consensus on the science framework:¹ 

• The active participation of individuals such as curriculum specialists, 
science teachers, science supervisors, state supervisors, administrators, 
individuals from business and industry, government officials, and 
parents; 

• The representation of what is considered essential learning in science, 
and the recommendation of innovative assessment techniques to probe 
the critical abilities and content areas; 

• The recognition of the lack of agreement on such things as common 
scope of instruction and sequence, components of scientific literacy, 
important outcomes of learning, and the nature of overarching themes 
in science. 



While maintaining some conceptual continuity with the 1990 NAEP Science 
Assessment, the 1996 framework takes into account the current reforms in science 
education, as well as documents such as the science framework used for the 1991 
International Assessment of Educational Progress. In addition, the Framework Steering 
Committee recommended that a variety of strategies, including the following, be used 
for assessing students’ performance.² 



¹ Science Framework for the 1996 National Assessment of Educational Progress, (Washington, DC: National Assessment 
Governing Board, 1993). 

² Ibid. 
• Performance tasks that allow students to manipulate physical objects and 
draw scientific understanding from the materials before them 

• Constructed-response questions that provide insights into students’ levels 
of understanding and ability to communicate in the sciences as well as 
their ability to generate, rather than simply recognize, information 
related to scientific concepts and their interconnections 

• Multiple-choice items that probe students’ conceptual understanding and 
ability to connect ideas in a scientifically sound way 

B.1 Percentage of Assessment Time by Domain 

The framework for the 1996 science assessment can be described as a 
two-dimensional matrix. The three fields of science (earth, physical, and life) make 
up the first dimension and ways of knowing and doing science (conceptual 
understanding, scientific investigation, and practical reasoning) make up the second 
dimension. Every question or task in the assessment is classified according to the two 
major dimensions. There are also two overarching domains — nature of science (that 
includes nature of technology) and themes (systems, models, and patterns of change). 

In addition to describing the content of the assessment, the framework also 
recommends what percentage of time should be devoted to each field of science, each 
way of knowing and doing science, the nature of science, and themes. 

In this section, each figure describes an element of the framework, and is followed 
by a table showing the actual distribution of assessment time as well as the distribution 
recommended by the framework. Care was taken to ensure congruence between the 
proportions actually used in the assessment and those recommended in the assessment 
specifications. Note that the tables represent all three grades assessed nationally; only 
grade 8 was assessed at the state level. 

Figure B.1 describes the fields of science and Table B.1 shows the actual and 
recommended distribution of assessment time across each field. The ways of knowing 
and doing science are outlined in Figure B.2. The distribution of assessment time for 
this dimension, both actual and recommended, is depicted in Table B.2. 
FIGURE B.1 

Description of the Three Fields of Science 



Earth Science 

The earth science content assessed centers on objects and events that are relatively accessible or 
visible. The concepts and topics covered are solid Earth (lithosphere), water (hydrosphere), air 
(atmosphere), and the Earth in space. The solid Earth consists of composition; forces that alter its 
surface; the formation, characteristics and uses of rocks; the changes and uses of soil; natural resources 
used by humankind; and natural forces within the Earth. Concepts and topics related to water consist 
of the water cycle; the nature of oceans and their effects on water and climate; and the location of 
water, its distribution, characteristics, and effect of and influence on human activity. The air is broken 
down into composition and structure of the atmosphere (including energy transfer); the nature of 
weather; common weather hazards; and air quality and climate. The Earth in space consists of setting 
of the Earth in the solar system; the setting and evolution of the solar system in the universe; tools 
and technology that are used to gather information about space; apparent daily motions of the Sun, the 
Moon, the planets and the stars; rotation of the Earth about its axis, and the Earth’s revolution around 
the Sun; and tilt of the Earth’s axis that produces seasonal variations in the climate. 

Physical Science 

The physical science component relates to basic knowledge and understanding concerning the structure 
of the universe as well as the physical principles that operate within it. The major sub-topics probed 
are matter and its transformations, energy and its transformations, and the motion of things. Matter 
and its transformations are described by diversity of materials (classification and types and the 
particulate nature of matter); temperature and states of matter; properties and uses of material 
(modifying properties, synthesis of materials with new properties); and resource management. Energy 
and its transformations involve different forms of energy; energy transformations in living systems, 
natural physical systems, and artificial systems constructed by humans; and energy sources and use, 
including distribution, energy conversion, and energy costs and depletion. Motion is broken down into 
an understanding of frames of reference; force and changes in position and motion; action and reaction; 
vibrations and waves as motion; general wave behavior; electromagnetic radiation; and the interactions 
of electromagnetic radiation with matter. 

Life Science 

The fundamental goal of life science is to attempt to understand and explain the nature and function 
of living things. The major concepts assessed in life science are change and evolution, cells and their 
functions (not at grade 4), organisms, and ecology. Change and evolution includes diversity of life 
on Earth; genetic variation within a species; theories of adaptation and natural selection; and changes 
in diversity over time. Cells and their functions consists of information transfer; energy transfer for 
the construction of proteins; and communication among cells. Organisms are described by 
reproduction, growth and development; life cycles; and functions and interactions of systems within 
organisms. The topic of ecology centers on the interdependence of life — populations, communities, 
and ecosystems. 

SOURCE: Science Framework for the 1996 National Assessment of Educational Progress, (Washington, DC: National 
Assessment Governing Board, 1993). 
TABLE B.1 

Distribution of Assessment Time by Field of Science 

            Earth                     Physical                  Life 
            Actual    Recommended     Actual    Recommended     Actual    Recommended 
Grade 4      33%         33%           34%         33%           33%         33% 
Grade 8      30%         30%           30%         30%           40%         40% 
Grade 12     33%         33%           33%         33%           34%         33% 
FIGURE B.2 

Description of Knowing and Doing Science 



Conceptual Understanding 

Conceptual understanding includes the body of scientific knowledge that students draw upon when 
conducting a scientific investigation or engaging in practical reasoning. Essential scientific concepts 
involve a variety of information including facts and events the student learns from science instruction 
and experiences with the natural environment and scientific concepts, principles, laws, and theories 
that scientists use to explain and predict observations of the natural world. 

Scientific Investigation 

Scientific investigation probes students’ abilities to use the tools of science, including both cognitive 
and laboratory tools. Students should be able to acquire new information, plan appropriate 
investigations, use a variety of scientific tools, and communicate the results of their investigations. 

Practical Reasoning 

Practical reasoning probes students’ ability to use and apply science understanding in new, real-world 
applications. 

SOURCE: Science Framework for the 1996 National Assessment of Educational Progress. (Washington, DC: National 
Assessment Governing Board, 1993). 
TABLE B.2 

Distribution of Assessment Time by Knowing and Doing Science 

            Conceptual Understanding    Scientific Investigation    Practical Reasoning 
            Actual    Recommended       Actual    Recommended       Actual    Recommended 
Grade 4      45%         45%             38%         45%             17%         10% 
Grade 8      45%         45%             29%         30%             26%         25% 
Grade 12     44%         45%             28%         30%             28%         25% 






The two overarching domains are described and accounted for by Figure B.3 
and Table B.3, which describe the nature of science and the themes that transcend the 
scientific disciplines. 
FIGURE B.3 

Description of Overarching Domains 





The Nature of Science 

The nature of science incorporates the historical development of science and technology, the habits 
of mind that characterize these fields, and methods of inquiry and problem-solving. It also 
encompasses the nature of technology that includes issues of design, application of science to 
real-world problems, and trade-offs or compromises that need to be made. 

Themes 

Themes are the “big ideas” of science that transcend the various scientific disciplines and enable 
students to consider problems with global implications. The NAEP science assessment focuses on 
three themes: systems, models, and patterns of change. 

• Systems are complete, predictable cycles, structures or processes occurring in natural 
phenomena. Students should understand that a system is an artificial construction 
created to represent or explain a natural occurrence. Students should be able to identify 
and define the system boundaries, identify the components and their interrelationships 
and note the inputs and outputs to the system. 

• Models of objects and events in nature are ways to understand complex or abstract 
phenomena. As such they have limits and involve simplifying assumptions but also 
possess generalizability and often predictive power. Students need to be able to 
distinguish the idealized model from the phenomenon itself and to understand the 
limitations and simplified assumptions that underlie scientific models. 

• Patterns of change involve students’ recognition of patterns of similarity and difference 
and of how these patterns change over time. In addition, students should have 
a store of common types of patterns and be able to transfer their understanding of a familiar pattern 
of change to a new and unfamiliar one. 



SOURCE: Science Framework for the 1996 National Assessment of Educational Progress. (Washington, DC: National 
Assessment Governing Board, 1993). 
TABLE B.3 

Distribution of Assessment Time by Overarching Domains 

            Nature of Science           Themes 
            Actual    Recommended       Actual*   Recommended 
Grade 4      19%        >15%             53%         33% 
Grade 8      21%        >15%             49%         50% 
Grade 12     31%        >15%             55%         50% 

* Several of the hands-on tasks were classified as themes. 

SOURCE: Science Framework for the 1996 National Assessment of Educational Progress. (Washington, DC: National 
Assessment Governing Board, 1993). 



B.2 The Assessment Design 

The state science assessment used booklets that were identical to those used at 
grade 8 for the national assessment. Each student in the state assessment program in 
science received a booklet containing six sections. Three of these sections were 
blocks³ of cognitive questions that assessed the knowledge and skills outlined in the 
framework, and the other three sections were sets of background questions. Two of the 
three cognitive sections were paper-and-pencil, and the third section consisted of a 
hands-on task with related questions. In the state assessment at grade 8, students were 
allowed 30 minutes to complete each cognitive block. (For the national assessment, 
students at grades 8 and 12 were allowed 30 minutes, while students at grade 4 were 
given cognitive blocks that each required 20 minutes to complete.) 

At each grade level there were 15 different sections or blocks of cognitive 
questions, but each student’s booklet contained only three of these blocks of items. 
Every block consisted of both multiple-choice and constructed-response questions. Short 
constructed-response questions required a few words or a sentence or two for an answer 
(e.g., briefly stating how nutrients move from the digestive system to the tissues) while 
the extended constructed-response questions generally required a paragraph or more 
(e.g., outlining an experiment to test the effect of increasing the amount of available food 
on the rate of increase of the hydra population). Some constructed-response questions 
also required diagrams, graphs, or calculations. It was expected that students could 
adequately answer the short constructed-response questions in about 2 to 3 minutes and 
the extended constructed-response questions in about 5 minutes. 



³ “Blocks” are collections of questions grouped, in part, according to the amount of time required to answer them. 
Other features were built into the blocks of cognitive questions. Four of the blocks 
were hands-on tasks in which students were given a set of equipment and asked to 
conduct an investigation and answer questions relating to the investigation. Every 
student was assessed on one of these four blocks. A second feature was the inclusion 
of three theme blocks — one assessing systems, one assessing models, and one assessing 
patterns of change. For example, students were shown a simplified model of part of the 
Solar System with a brief description, and then asked a number of questions based on 
this scenario. Theme blocks were randomly placed in booklets, but not in all booklets. 
No student received more than one theme block. 

Each booklet in the assessment also included three sets of student background 
questions. The first, consisting of general background questions, asked students about 
such things as mother’s and father’s level of education, reading materials in the home, 
homework, and school attendance. The second, consisting of science background 
questions, asked students questions about their classroom learning activities such as 
hands-on exercises, courses taken, use of specialized resources such as computers, and 
views on the utility and value of science. Students were given five minutes to complete 
each of these questionnaires. The third set contained five questions about students’ 
motivation to do well on the assessment, their perception of the difficulty of the 
assessment, and their familiarity with the types of cognitive questions asked. This 
section took three minutes or less to complete. 

Using information gathered from the field test, the booklets were carefully 
constructed to balance time requirements for the question types in each block. For more 
information on the design of the assessment, the reader is referred to Appendix C. 
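
The section timings described above can be added up as a quick check. The dictionary keys are illustrative labels rather than NAEP terminology, and the motivation section is entered at its stated maximum of three minutes.

```python
# Minutes allowed per booklet section in the grade 8 state assessment.
SECTION_MINUTES = {
    "cognitive block 1": 30,
    "cognitive block 2": 30,
    "hands-on task block": 30,
    "general background questions": 5,
    "science background questions": 5,
    "motivation questions": 3,  # "three minutes or less"
}

cognitive_minutes = sum(
    minutes for name, minutes in SECTION_MINUTES.items() if "block" in name
)
total_minutes = sum(SECTION_MINUTES.values())

print("cognitive time:", cognitive_minutes, "minutes")
print("total working time:", total_minutes, "minutes")
```

The working time comes to at most 103 minutes, which is consistent with the statement in Appendix C that the total assessment time, including cleanup and collection of hands-on materials, was approximately two hours.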
B.3 Usage of Question Types 

The data in Table B.4 reflect the number of questions by type and by grade level 
for the 1996 assessment. One hundred and sixty-five multiple-choice (MC), 219 short 
constructed-response (SCR), and 59 extended constructed-response (ECR) questions 
make up the assessment, giving a total of 443 unique questions in the pool. Some of 
these questions were used at more than one grade level; thus, the sum at each grade level 
is greater than the total number of unique questions. For the state assessment program 
at grade 8, students responded to subsets (determined by booklet) of 74 multiple-choice 
questions, 100 short constructed-response questions, and 20 extended 
constructed-response tasks. 
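
The counts quoted in this paragraph can be cross-checked against the rows of Table B.4. The sketch below enters each row of the table as the set of grades that share the questions together with the counts by type; the variable and function names are illustrative.

```python
# Each entry: (grades that use the questions, (MC, SCR, ECR) counts),
# taken from the rows of Table B.4.
ITEM_POOL = [
    ({4}, (42, 57, 12)),      # Grade 4 only
    ({4, 8}, (9, 16, 4)),     # Grades 4 & 8 overlap
    ({8}, (44, 58, 13)),      # Grade 8 only
    ({8, 12}, (21, 26, 3)),   # Grades 8 & 12 overlap
    ({12}, (49, 62, 27)),     # Grade 12 only
]


def totals_for_grade(grade):
    """Sum the (MC, SCR, ECR) counts over every row used at a grade."""
    rows = [counts for grades, counts in ITEM_POOL if grade in grades]
    return tuple(sum(column) for column in zip(*rows))


# Unique questions: every row counted once, regardless of sharing.
unique_by_type = tuple(
    sum(column) for column in zip(*(counts for _, counts in ITEM_POOL))
)
unique_total = sum(unique_by_type)
sum_over_grades = sum(sum(totals_for_grade(g)) for g in (4, 8, 12))

print("grade 8 totals (MC, SCR, ECR):", totals_for_grade(8))
print("unique questions by type:", unique_by_type, "total:", unique_total)
print("sum of grade totals:", sum_over_grades)
```

The grade 8 totals of 74, 100, and 20 and the pool of 443 unique questions both follow from the table, and the sum over grades exceeds 443 because the overlap rows are counted at two grades.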
TABLE B.4 

Distribution of Items by Question Type 

                            Grade 4             Grade 8             Grade 12 
                          MC   SCR   ECR      MC   SCR   ECR      MC   SCR   ECR 
Grade 4 only              42    57    12 
Grades 4 & 8 overlap       9    16     4       9    16     4 
Grade 8 only                                  44    58    13 
Grades 8 & 12 overlap                         21    26     3      21    26     3 
Grade 12 only                                                     49    62    27 
TOTAL by grade            51    73    16      74   100    20      70    88    30 

MC: multiple-choice questions; SCR: short constructed-response questions; ECR: extended constructed-response 
questions 
APPENDIX C 



Technical Appendix: The Design, 
Implementation, and Analysis of the 1996 
State Assessment Program in Science 

C.1 Overview 

The purpose of this appendix is to provide technical information about the 1996 
state assessment program in science. It describes the design of the assessment and gives 
an overview of the steps used to implement the program, from the planning stages 
through the analysis of the data. 

This appendix is one of several documents that provide technical information 
about the 1996 state assessment program. Readers interested in more details are referred 
to the Technical Report of the NAEP 1996 State Assessment Program in Science. 
Theoretical information about the models and procedures used in NAEP can be found 
in the special NAEP-related issue of the Journal of Educational Statistics (Summer 
1992/Volume 17, Number 2) as well as previous national technical reports. 

Educational Testing Service (ETS) was awarded the cooperative agreement for the 
1996 NAEP programs, including the state assessment program. ETS was responsible 
for overall management of the programs as well as for development of the overall 
design, the cognitive questions and questionnaires, data analysis, and reporting. National 
Computer Systems (NCS) was a subcontractor to ETS on both the national and state 
NAEP programs. NCS was responsible for printing, distributing, and receiving all 
assessment materials, and for scanning and scoring the assessments. The National 
Center for Education Statistics (NCES) awarded a separate cooperative agreement to 
Westat, Inc., for handling all aspects of sampling and field operations for the national 
and state assessments for 1996. 
Organization of the Technical Appendix 

This appendix has the following organization: 

• Section C.2 provides an overview of the design of the 1996 state 
assessment program in science. 

• Section C.3 discusses the partially-balanced incomplete block (PBIB) 
spiral design used to assign cognitive questions to assessment booklets 
and assessment booklets to students. 

• Section C.4 outlines the sampling design used for the 1996 state 
assessment program. 

• Section C.5 summarizes Westat’s field administration procedures. 

• Section C.6 describes the flow of the data from receipt at NCS through 
data entry and professional scoring. 

• Section C.7 summarizes the procedures used to weight the assessment 
data and to obtain estimates of the sampling variability of subpopulation 
estimates. 

• Section C.8 describes the initial analyses performed to verify the quality 
of the data. 

• Section C.9 describes the item response theory scales and the overall 
science composite scale created for the final analyses of the state 
assessment program data. 

• Section C.10 provides an overview of the linking of the scaled results 
from the state assessment program in science to those from the national 
assessment. 

C.2 Design of the NAEP 1996 State Assessment Program in Science 

The design for the state assessment program in science included the following 
major aspects: 

• Participation at the jurisdiction level was voluntary, except for a few 
jurisdictions for which NAEP has been mandated by the state legislature. 

• Students from public and nonpublic schools were assessed. Nonpublic 
schools included Catholic schools, other religious schools, and private 
schools. Separate representative samples of public and nonpublic 
schools were selected in each participating jurisdiction and students were 
randomly sampled within schools. The size of a jurisdiction’s nonpublic 
school sample was proportional to the percentage of students in that 
jurisdiction attending such schools. 
• The eighth-grade science assessment instruments used for the state 
assessment program and the national assessment consisted of 15 blocks 
of questions, of which 4 were hands-on tasks. Each block could contain 
a mixture of question types — constructed-response or multiple-choice 
— that was determined by the nature of the task. In addition, the 
constructed-response questions were of two types: short 
constructed-response questions required students to respond to a 
question with a few words or a few sentences, while extended 
constructed-response questions required students to respond to a 
question with a paragraph or more, sometimes including graphs or 
calculations. The hands-on tasks were similar to laboratory exercises. 
Each student was given 2 of the 11 paper-and-pencil cognitive blocks of 
questions and one of the 4 hands-on blocks. 

• A complex form of matrix sampling called a partially balanced 
incomplete block (PBIB) spiraling design was used. With PBIB 
spiraling, students in an assessment session received different booklets 
containing 3 of the 15 blocks. This provided for greater science content 
coverage than would have been possible by administering an identical set 
of questions to each student, without imposing an undue testing burden. 

• Sets of background questions given to the students, the students’ science 
teachers, and the principals or other school administrators provided a 
variety of contextual information. The background questionnaires for 
the state assessment program were identical to those used in the national 
eighth-grade assessment. 

• The total assessment time for each student was approximately two hours, 
including cleanup and collection of materials from hands-on tasks. Each 
assessed student was assigned a science booklet that contained 3 of the 
15 blocks of science questions requiring 30 minutes each (including a 
hands-on task block in the last position), followed by a 5-minute general 
background questionnaire, a 5-minute science background questionnaire, 
and a 3-minute motivation questionnaire. Thirty-seven different 
booklets were assembled. 

• The assessments were administered in the five-week period between 
January 29 and March 4, 1996. One-fourth of the schools in each 
jurisdiction were assessed each week throughout the first four weeks. 
Because of the severe weather throughout much of the country, the fifth 
week was used for regular testing as well as for makeup sessions. 

• Data collection was, by law, the responsibility of each participating 
jurisdiction. Security and uniform assessment administration were high 
priorities. Extensive training of state assessment personnel was 
conducted to assure that the assessment would be administered under 
standard, uniform procedures. For jurisdictions that had participated in 
previous NAEP state assessments, 25 percent of both public and 
nonpublic school assessment sessions were monitored by Westat staff. 
For the jurisdictions new to NAEP, 50 percent of both public and 
nonpublic school sessions were monitored. 

C.3 Assessment Instruments 

The student assessment booklets contained six sections and included both cognitive 
and noncognitive questions. The assembly of cognitive questions into booklets and their 
subsequent assignment to assessed students were determined by a matrix sampling 
design using a variant of a balanced incomplete block design (BIB), with spiraled 
administration. Each assessed student received a booklet containing 3 of the 15 
cognitive blocks according to a design that ensured that each block was administered to 
a representative sample of students within each jurisdiction. The third cognitive block 
was always one of the four hands-on blocks; this requirement meant that the BIB was 
partially balanced (PBIB). 
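
The spiraled administration described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the 37 booklets are numbered 1 to 37 and handed out in a fixed rotating order; the actual block-to-booklet map is not given in this appendix, and the function name and the `start` offset are hypothetical.

```python
from collections import Counter
from itertools import cycle, islice

N_BOOKLETS = 37  # number of distinct booklets stated in this appendix


def spiral_booklets(n_students, start=0, n_booklets=N_BOOKLETS):
    """Assign booklet numbers 1..n_booklets to students in rotating order.

    `start` is a hypothetical offset so that a later session can continue
    the rotation where an earlier session stopped.
    """
    order = cycle(range(1, n_booklets + 1))
    return list(islice(order, start, start + n_students))


if __name__ == "__main__":
    # Booklets received by the first five students in a session.
    print(spiral_booklets(5))
    # Across many students, every booklet is used almost equally often.
    counts = Counter(spiral_booklets(3000))
    print(min(counts.values()), max(counts.values()))
```

Rotating through the booklets in this way means that neighboring students in a session work on different booklets and that each booklet, and therefore each block, reaches a comparable share of the sample.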

In addition to two 30-minute sections of cognitive questions and the 30-minute 
performance task section, each booklet included two 5-minute sets of general and science 
background questions designed to gather contextual information about students, their 
experiences in science, and their attitudes toward the subject, and one 3-minute section 
of motivation questions designed to gather information about the student’s level of 
motivation while taking the assessment. 

In addition to the student assessment booklets, three other instruments provided 
data relating to the assessment: a science teacher questionnaire, a school characteristics 
and policies questionnaire, and an SD/LEP student questionnaire (for students 
categorized as students with disabilities or with limited English proficiency). 

The teacher questionnaire was administered to the science teachers of the 
eighth-grade students participating in the assessment. The questionnaire consisted of 
three sections and took approximately 20 minutes to complete. The first section focused 
on the teacher’s general background and experience; the second, on the teacher’s 
background related to science; and the third, on classroom information about science 
instruction. 

The school characteristics and policies questionnaire was given to the principal 
or other administrator in each participating school and took about 20 minutes to 
complete. The questions asked about the principal’s background and experience, school 
policies, programs, and facilities, and the demographic composition and background of 
the students and teachers. 

The SD/LEP student questionnaire was completed by the staff member most 
familiar with any student selected for the assessment who was classified in either of two 
ways: students with disabilities (SD) had an Individualized Education Plan (IEP) or 
equivalent special education plan (for reasons other than being gifted and talented); 
students with limited English proficiency were classified as LEP students. The 
questionnaire took approximately three minutes to complete and asked about the student 
and the special programs in which the student participated. It was completed for all 
selected SD or LEP students regardless of whether or not they participated in the 
assessment. Selected SD or LEP students participated in the assessment if they were 
determined by the school to be able to participate, considering the terms of their IEP 
and accommodations provided by the school or by NAEP. 

C.4 The Sampling Design 

The sampling design for NAEP is complex, in order to minimize burden on 
schools and students while maximizing the utility of the data. For further details see the 
Technical Report for the NAEP 1996 State Assessment Program in Science. The target 
populations for the state assessment program in science consisted of eighth-grade 
students enrolled in either public or nonpublic schools. The representative samples of 
public school eighth graders assessed in the state assessment program came from about 
100 schools (per grade) in each jurisdiction. If a jurisdiction had fewer than 100 public 
schools with a particular grade, all or almost all schools were asked to participate. If a 
jurisdiction had smaller numbers of students in each school than expected, more than 
100 schools were selected for participation. The nonpublic school samples differed in 
size across the jurisdictions, with the number of schools selected proportional to the 
nonpublic school enrollment within each jurisdiction. Typically, about 25 nonpublic 
schools were included for each jurisdiction. The school samples in each state were 
designed to produce aggregate estimates for the jurisdiction and for selected 
subpopulations (depending upon the size and distribution of the various subpopulations 
within the jurisdiction) and also to enable comparisons to be made, at the jurisdiction 
level, between administration of assessment tasks with monitoring and without 
monitoring. The public schools were stratified by urbanization, percentage of Black and 
Hispanic students enrolled, and median household income within the ZIP code area of 
the school. The nonpublic schools were stratified by type of control (Catholic, 
private/other religious, other nonpublic), metropolitan status, and enrollment size per 
grade. 

The national and regional results are based on nationally representative samples 
of eighth-grade students. The samples were selected using a complex multistage 
sampling design involving the sampling of students from selected schools within selected 
geographic areas across the country. The sample design had the following stages: 

(1) selection of geographic areas (a county, group of counties, or a 
metropolitan statistical area); 

(2) selection of schools (public and nonpublic) within the selected areas; and 

(3) selection of students within selected schools. 

Each selected school that participated in the assessment, and each student assessed, 
represents a portion of the population of interest. To make valid inferences from student 
samples to the respective populations from which they were drawn, sampling weights 
are needed. Discussions of sampling weights and how they are used in analyses are 
presented in sections C.7 and C.8. 

The state results provided in this report are based on state-level samples of 
eighth-grade students. The samples of both public and nonpublic school students were 
selected based on a two-stage sample design that entailed selecting students within 
schools. The first-stage samples of schools were selected with a probability proportional 
to the eighth-grade enrollment in the schools. Special procedures were used for 
jurisdictions with many small schools and for jurisdictions with a small number of 
schools. As with the national samples, the state samples were weighted to allow for 
valid inferences about the populations of interest. 
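
The first-stage selection with probability proportional to enrollment can be sketched as follows. This is an illustrative example only: it uses systematic sampling along the cumulative enrollment total, which is one common way to obtain such probabilities, and the appendix does not state that this exact mechanism was used. The function name and the example enrollments are hypothetical.

```python
import random


def pps_systematic_sample(enrollments, n_sample, rng=None):
    """Select n_sample school indices with probability proportional to size.

    enrollments: eighth-grade enrollment counts, one per school.
    Assumes no school is larger than the sampling interval; otherwise a
    school could be hit more than once.
    """
    rng = rng or random.Random()
    total = sum(enrollments)
    interval = total / n_sample
    start = rng.random() * interval          # random start in [0, interval)
    targets = [start + k * interval for k in range(n_sample)]

    selected = []
    cumulative = 0.0
    idx = 0
    for target in targets:
        # Advance to the school whose cumulative enrollment range holds the target.
        while idx < len(enrollments) - 1 and cumulative + enrollments[idx] <= target:
            cumulative += enrollments[idx]
            idx += 1
        selected.append(idx)
    return selected


if __name__ == "__main__":
    sizes = [50, 100, 150, 200] * 10         # hypothetical school enrollments
    print(pps_systematic_sample(sizes, 8, random.Random(1)))
```

Under this scheme a school with 200 eighth graders is four times as likely to be selected as a school with 50, which is why the sampling weights described in Section C.7 are needed to restore the correct balance in the estimates.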

The results presented for a particular jurisdiction are based on the representative 
sample of students who participated in the 1996 state assessment program. The results 
for the nation and regions of the country are based on the nationally and regionally 
representative samples of students who were assessed as part of the national NAEP 
program. Using the national and regional results from the 1996 national assessment 
was necessary because of the voluntary nature of the state assessment program. Because 
not every state participated in the program, the aggregated data across states did not 
necessarily provide representative national or regional results. 

In most jurisdictions, up to 30 students were selected from each school, with the 
aim of providing an initial sample size of approximately 3,000 public school students 
per jurisdiction for the eighth grade. The student sample size of 30 for each school was 
chosen to ensure that at least 2,000 public school students participated from each 
jurisdiction, allowing for school nonresponse, exclusion of students, inaccuracies in the 
measures of enrollment, and student absenteeism from the assessment. In jurisdictions 
with fewer schools, larger numbers of students per school were often required to ensure 
initial samples of roughly 3,000 students. In certain jurisdictions, all eligible eighth 
graders were targeted for assessment. Jurisdictions were given the option to reduce the 
expected student sample size in order to reduce testing burden and the number of 
multiple-testing sessions for participating schools. At grade 8, four jurisdictions (Alaska, 
Delaware, Hawaii, and Rhode Island) elected to exercise this option. Using this option 
can involve compromises such as higher standard errors and accompanying loss of 
precision. 

In order to provide for wider inclusion of students with disabilities and limited 
English proficiency, the 1996 state assessments in both mathematics and science 
involved dividing the sample of students at each grade level into two subsamples, 
referred to as S1 and S2. S1 provided continuity with the 1992 mathematics assessment 
and thus allowed for the reporting of performance over time by using the same exclusion 
criteria for students with disabilities and limited English proficiency as were used in that 
assessment. S2 provided for wider inclusion of students with disabilities and limited 
English proficiency by incorporating new exclusion rules. 

The NAEP 1996 science assessment was developed using a new framework, and 
therefore does not include reporting of performance over time. However, in order to 
make the sample design identical for both subjects at the state level, both S1 and S2 
were included. For further discussion, see the NAEP 1996 Science Report Card. 

The 1996 national assessment in science used only the more inclusive S2 
guidelines for student participation. The national assessments in mathematics and 
science both involved an additional subsample, S3, in which accommodations were 
provided for certain students with disabilities or limited English proficiency, again in 
order to make NAEP more inclusive. 

For the national science assessment, scaling and analysis procedures (discussed in 
sections C.8 through C.10) were applied to all assessed students from S2. For the state 
science assessment, scaling and analysis procedures were applied to a combination of 
all assessed students from S2 and students who were not identified as SD or LEP from 
S1. This combination of segments of the S1 and S2 subsamples maximized the 
usefulness of available data while allowing for comparisons to the student population in 
the national sample. This combination, referred to as the “reporting sample,” was the 
sample used to link the state science assessment to the national assessment (see Section 
C.10), as well as for scaling and reporting. 
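
The rule that defines the reporting sample can be expressed as a simple filter. The sketch assumes a hypothetical student record with a subsample label, an SD/LEP flag, and an assessed flag; these field names are illustrative and are not taken from the NAEP database.

```python
from dataclasses import dataclass


@dataclass
class Student:
    student_id: str
    subsample: str    # "S1" or "S2"
    sd_or_lep: bool   # identified as SD or LEP
    assessed: bool    # actually took the assessment


def reporting_sample(students):
    """All assessed S2 students plus assessed S1 students not identified as SD or LEP."""
    return [
        s for s in students
        if s.assessed and (
            s.subsample == "S2"
            or (s.subsample == "S1" and not s.sd_or_lep)
        )
    ]


if __name__ == "__main__":
    roster = [
        Student("a", "S1", False, True),   # kept
        Student("b", "S1", True, True),    # dropped: SD/LEP student from S1
        Student("c", "S2", True, True),    # kept
        Student("d", "S2", False, False),  # dropped: not assessed
    ]
    print([s.student_id for s in reporting_sample(roster)])
```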

Additional analyses will be conducted on the national samples to study the effects 
of changing the exclusion rules and allowing the use of accommodations. Preliminary 
discussion can be found in the NAEP 1996 Science Report Card and the NAEP 1996 
Mathematics Report Card; more detailed discussion will follow in future NAEP 
publications. 

C.5 Field Administration 

Administering the 1996 program required collaboration among staff in the 
participating jurisdictions and schools and the NAEP contractors, especially Westat, the 
field administration contractor. 

Each jurisdiction volunteering to participate in the 1996 state assessment program 
appointed a state coordinator to serve as liaison between NAEP staff and the 
participating schools. In addition, Westat hired and trained a supervisor for each 
jurisdiction and six field managers who worked with groups of jurisdictions. The state 
supervisors worked with the state coordinators, overseeing assessment activities, training 
school district personnel to administer the assessment, and coordinating quality control 
monitoring efforts. Each field manager worked with the state coordinators from seven 
to eight jurisdictions and the state supervisors assigned to those jurisdictions. An 
assessment administrator prepared and conducted the assessment session in one or more 
schools. These individuals were usually school or district staff and were trained by 
Westat. Westat also hired and trained three to five quality control monitors in each 
jurisdiction. For jurisdictions that had previously participated in the state assessment 
program, 25 percent of the public and nonpublic school sessions were monitored. For 
jurisdictions new to the program, 50 percent of all sessions were monitored. The 
assessment sessions were conducted during a five-week period beginning in late January 
1996. 

C.6 Materials Processing, Professional Scoring, and Database 
Creation 

Upon completion of each assessment session, school personnel shipped the 
assessment booklets and forms to NCS for professional scoring, entry into computer 
files, and checking. The files were then sent to ETS for creation of the database. 

After NCS received all appropriate materials from a school, they were forwarded 
to the professional scoring area where the responses to the constructed-response questions 
were evaluated by trained staff using guidelines prepared by ETS. Each 
constructed-response question had a unique scoring guide that defined the criteria to be 
used in evaluating students’ responses. The extended constructed-response questions 
were evaluated with four- or five-level rubrics. Some of the short constructed-response 
questions were rated according to three-level rubrics that permit partial credit to be 
given; other short constructed-response questions were scored as either acceptable or 
unacceptable. 

For the national science assessment and the state assessment program in science, 
over 4.1 million constructed responses were scored. This figure includes rescoring to 
monitor interrater reliability. The overall percentage of agreement between scorers for 
the reliability sample was 93 percent for the tasks in the cognitive blocks and 95 percent 
for the hands-on tasks. 
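
As an illustration of how a percentage of agreement of this kind can be computed, the sketch below compares first and second scores for the same responses. It assumes that "agreement" means the two scorers assigned exactly the same score; the appendix does not define the term further, and the function name is hypothetical.

```python
def percent_exact_agreement(first_scores, second_scores):
    """Percentage of responses given the same score by two scorers."""
    if len(first_scores) != len(second_scores):
        raise ValueError("score lists must have the same length")
    if not first_scores:
        raise ValueError("no responses to compare")
    matches = sum(a == b for a, b in zip(first_scores, second_scores))
    return 100.0 * matches / len(first_scores)


if __name__ == "__main__":
    # Four rescored responses, three of which received the same score twice.
    print(percent_exact_agreement([1, 2, 3, 4], [1, 2, 3, 3]))
```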

Data transcription and editing procedures were used to generate the disk and tape 
files containing various assessment information, including the sampling weights required 
to make valid statistical inferences about the population from which the state assessment 
program sample was drawn. Prior to analysis, the data from these files underwent a 
quality control check at ETS. The files were then merged into a comprehensive, 
integrated database. 

C.7 Weighting and Variance Estimation 

A complex sample design was used to select the students who were assessed in 
each of the participating jurisdictions. The properties of a sample selected through a 
complex design are very different from those of a simple random sample in which every 
student in the target population has an equal chance of selection and in which the 
observations from different sampled students can be considered to be statistically 
independent of one another. Therefore, the properties of the sample for the complex 
state assessment program design were taken into account during the analysis of the 
assessment data. 

One way that the properties of the sample design were addressed was by using 
sampling weights to account for the fact that the probabilities of selection were not 
identical for all students. All population and subpopulation characteristics based on the 
state assessment program data used sampling weights in their estimation. These weights 
included adjustments for school and student nonresponse. 

Not only must appropriate estimates of population characteristics be derived, but 
appropriate measures of the degree of uncertainty must be obtained for those statistics. 
One component of uncertainty results from sampling variability, which is a measure of 
the dependence of the results on the particular sample of students actually assessed. 
Because of the effects of cluster selection (schools are selected first, then students are 
selected within those schools), observations made on different students cannot be 
assumed to be independent of each other (and, in fact, are generally positively 
correlated). As a result, classical variance estimation formulas will produce incorrect 
results. Thus, a jackknife variance estimation procedure that accounts for the 
characteristics of the sample was used for all analyses. 
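
The general idea of a jackknife for clustered, weighted data can be sketched as follows. This is a generic delete-one-cluster jackknife for a weighted mean, offered only as an illustration; it is not the specific replicate scheme used for NAEP, which is documented in the technical report. All names are hypothetical.

```python
def weighted_mean(values, weights):
    """Mean of values using sampling weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_variance(values, weights, clusters):
    """Delete-one-cluster jackknife variance of the weighted mean.

    clusters: a cluster label (for example a school id) for each observation.
    Whole clusters are dropped, so the correlation among students in the
    same school is reflected in the variance estimate.
    """
    labels = sorted(set(clusters))
    g = len(labels)
    if g < 2:
        raise ValueError("need at least two clusters")
    replicates = []
    for drop in labels:
        kept = [(v, w) for v, w, c in zip(values, weights, clusters) if c != drop]
        vals, wts = zip(*kept)
        replicates.append(weighted_mean(vals, wts))
    center = sum(replicates) / g
    return (g - 1) / g * sum((r - center) ** 2 for r in replicates)


if __name__ == "__main__":
    scores = [140.0, 150.0, 155.0, 165.0, 170.0, 160.0]
    wts = [10.0, 10.0, 20.0, 20.0, 15.0, 15.0]
    schools = ["A", "A", "B", "B", "C", "C"]
    print(weighted_mean(scores, wts), jackknife_variance(scores, wts, schools) ** 0.5)
```

When every cluster holds a single equally weighted observation, this estimator reduces to the familiar simple-random-sample variance of the mean, which makes it a convenient check on an implementation.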

Jackknife variance estimation provides a reasonable measure of uncertainty for any 
statistic based on values observed without error. Statistics such as the percentage of 
students correctly answering a given question meet this requirement, but other statistics 
based on estimates of student science performance, such as the average science scale 
score of a subpopulation, do not. Because each student typically responds to relatively 
few questions from a particular field of science (e.g., physical or life science), a 
nontrivial amount of imprecision exists in the measurement of the scale score of a given 
student. This imprecision adds another component of variability to statistics based on 
estimates of individual performance. 






C.8 Preliminary Data Analysis 

After the computer files of student responses were received and merged into an 
integrated database, all cognitive and noncognitive questions were subjected to an 
extensive item analysis. For each cognitive question, this analysis yielded the number 
of respondents, the percentage of responses in each category, the percentage who omitted 
the question, the percentage who did not reach the question, and the correlation between 
the question score and the block score. In addition, the item analysis program provided 
summary statistics for each block of cognitive questions, including a reliability (internal 
consistency) coefficient. These analyses were used to check the scoring of the questions, 
to verify that the difficulty level of the questions was appropriate, and to ensure that 
students had received adequate time to complete the assessment. The results were 
reviewed by knowledgeable project staff in search of aberrations that might signal 
unusual results or errors in the database. 

The question and block-level analyses were conducted using rescaled versions of 
the final sampling weights provided by Westat (see Section C.7). The rescaling was 
implemented for each jurisdiction. The sum of the sampling weights for the public 
school students within each jurisdiction was constrained to equal a common value. The same 
transformation was applied to the weights of the nonpublic school students in that 
jurisdiction. The sum of the weights for each of the Department of Defense (DoDEA) 
samples (i.e., DDESS and DoDDS) was constrained to equal the same value as the 
public school students in other jurisdictions. Using rescaled weights does not alter the 
value of statistics calculated separately within each jurisdiction. However, for statistics 
obtained from samples that combine students from different jurisdictions, using rescaled 
weights results in a roughly equal contribution of each jurisdiction's data to the final 
value of the estimate. Equal contribution of each jurisdiction's data to the results of the 
item response theory (IRT) scaling was viewed as a desirable outcome. The original 
final sampling weights provided by Westat were used in reporting. 
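
The weight rescaling can be sketched in a few lines. The common total (`target_sum`) is a hypothetical value, since the appendix does not state it, and the record layout is illustrative. The factor is computed from the public school weights in a jurisdiction and then applied to every student in that jurisdiction, as described above; the separate treatment of the DoDEA samples is not modeled here.

```python
from collections import defaultdict


def rescale_weights(records, target_sum=1000.0):
    """Rescale weights so public school weights sum to target_sum per jurisdiction.

    records: list of (jurisdiction, is_public, weight) tuples.
    Assumes every jurisdiction has at least one public school student.
    """
    public_totals = defaultdict(float)
    for jurisdiction, is_public, weight in records:
        if is_public:
            public_totals[jurisdiction] += weight

    rescaled = []
    for jurisdiction, is_public, weight in records:
        # One factor per jurisdiction, shared by public and nonpublic students.
        factor = target_sum / public_totals[jurisdiction]
        rescaled.append((jurisdiction, is_public, weight * factor))
    return rescaled


if __name__ == "__main__":
    data = [
        ("MT", True, 10.0), ("MT", True, 30.0), ("MT", False, 5.0),
        ("XX", True, 100.0), ("XX", True, 100.0),
    ]
    for row in rescale_weights(data):
        print(row)
```

Because every weight within a jurisdiction is multiplied by the same factor, weighted means and percentages computed within that jurisdiction are unchanged; only the relative influence of jurisdictions in pooled analyses is affected.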

Additional analyses that compared the data from the monitored sessions with those 
from the unmonitored sessions were conducted to determine the comparability of the 
assessment data from the two types of administrations. Differential item functioning 
(DIF) analyses were carried out using the national assessment data. DIF analyses 
identified questions that were differentially difficult for various subgroups, so that these 
questions could be re-examined for their fairness and their appropriateness for inclusion 
in the scaling process. 

C.9 Scaling the Assessment Questions 

The primary analysis and reporting of the results from the state assessment 
program used item response theory (IRT) scale-score models. Scaling models quantify 
a respondent’s tendency to provide correct answers to the domain of questions that 
contribute to a scale as a function of a parameter called performance, estimated by a 
scale score. The scale scores can be viewed as a summary measure of performance 
across the domain of questions that make up the scale. Three distinct IRT models were 
used for scaling: three-parameter logistic models for multiple-choice questions; 
two-parameter logistic models for short constructed-response questions that were scored 
correct or incorrect; and generalized partial credit models for short and extended 
constructed-response questions that were scored on a multipoint scale (i.e., greater than 
two levels). 
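
The three model families named above can be written out as response-probability functions. The sketch uses one common parameterization, with discrimination `a`, location `b`, lower asymptote `c`, step parameters for the partial credit model, and a scaling constant of 1.7; these symbols and the constant are assumptions for illustration and are not specified in this appendix.

```python
import math

D = 1.7  # conventional scaling constant; an assumption, not stated in this appendix


def p_3pl(theta, a, b, c):
    """Three-parameter logistic model: probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def p_2pl(theta, a, b):
    """Two-parameter logistic model: the 3PL with a lower asymptote of zero."""
    return p_3pl(theta, a, b, 0.0)


def p_gpcm(theta, a, b, steps):
    """Generalized partial credit model: probabilities of scores 0..len(steps).

    steps: step parameters d_1..d_m for the m score levels above zero.
    """
    exponents = [0.0]
    for d in steps:
        exponents.append(exponents[-1] + D * a * (theta - b + d))
    peak = max(exponents)                       # subtract the maximum for numerical stability
    numerators = [math.exp(e - peak) for e in exponents]
    total = sum(numerators)
    return [n / total for n in numerators]


if __name__ == "__main__":
    print(p_3pl(0.0, 1.0, 0.0, 0.2))            # multiple-choice item
    print(p_2pl(0.0, 1.0, 0.0))                 # right/wrong constructed response
    print(p_gpcm(0.0, 1.0, 0.0, [0.5, -0.5]))   # three-level constructed response
```

With a single step parameter of zero, the partial credit function returns the same probability of the higher score as the two-parameter model, which shows how the three families nest.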

Three distinct scales were created for the state assessment program in science to 
summarize eighth-grade students’ abilities according to the three defined fields of 
science (earth, physical, and life). These scales were defined identically to, but 
separately from, those used for the scaling of the national NAEP eighth-grade science 
data. Although the questions composing each scale were identical to those used in the 
national assessment program, the item parameters for the state assessment program 
scales were estimated from combined public school data from the jurisdictions 
participating in the state assessment program.¹ Item parameter estimation was carried 
out on an item calibration subsample. The calibration subsample consisted of an 
approximately 25 percent sample drawn from all available public school data. To 
ensure equal representation in the scaling process, each jurisdiction contributed the same 
number of students to the item calibration sample. Within each jurisdiction, 25 percent 
of the calibration sample was taken from monitored administrations while the remaining 
75 percent came from unmonitored administrations. 

Within each scale, the estimates of the empirical item characteristic functions were 
compared with the theoretical curves to determine how well the IRT model fit the 
observed data. For correct-incorrect questions, nonmodel-based estimates of the 
expected proportions of correct responses to each question for students with various 
levels of scale proficiency were compared with the fitted item response curve. For the 
short and extended partial-credit constructed-response questions, the comparisons were 
based on the expected proportions of students with various levels of scale proficiency 
who achieved each score level. In general, the scaling models fit the question-level 
results well. 



¹ For the creation of scales, schools from the DoDEA jurisdictions are considered nonpublic, so the responses from these 
students were not included in the item calibration sample. 

Using the item parameter estimates, estimates of various population statistics were 
obtained for each jurisdiction. The NAEP methods use random draws (“plausible 
values”) from estimated proficiency distributions for each student to compute population 
statistics. Plausible values are not optimal estimates of individual student proficiencies; 
instead, they serve as intermediate values to be used in estimating population 
characteristics. Under the assumptions of the scaling models, these population estimates 
will be consistent, in the sense that the estimates approach the model-based population 
values as the sample size increases, which would not be the case for population estimates 
obtained by aggregating optimal estimates of individual performance. 
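
One way to see how plausible values feed into a population estimate and its standard error is the following sketch. It applies the standard multiple-imputation combining rule: the statistic is computed once per set of plausible values, the results are averaged, and the variation among them is added to the sampling variance as the measurement component described in Section C.7. The number of plausible values and the exact variance formula used in NAEP are not given in this appendix, so the function and its inputs are illustrative assumptions.

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Combine one statistic computed separately on each set of plausible values.

    estimates: the statistic computed from each of the M plausible-value sets.
    sampling_variances: the sampling (for example jackknife) variance for each set.
    Returns (point_estimate, standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need matching lists with at least two entries")
    point = mean(estimates)
    within = mean(sampling_variances)        # sampling variability
    between = variance(estimates)            # measurement variability, divisor m - 1
    total = within + (1.0 + 1.0 / m) * between
    return point, total ** 0.5


if __name__ == "__main__":
    # Hypothetical subgroup means from five sets of plausible values.
    print(combine_plausible_values([150, 152, 154, 151, 153], [1.0] * 5))
```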

The 1996 science assessment was developed using a new framework. Because it 
was not appropriate to compare results from the 1996 assessment to those of previous 
NAEP science assessments, no attempt was made to link or align scores on the new 
assessment to those of previous assessments. Therefore, it was necessary to establish a 
new scale for reporting. Earlier NAEP assessments (such as the current mathematics 
assessment and the 1994 reading assessment) were developed with a cross-grade 
framework, in which the trait being measured is conceptualized as cumulative across the 
grades of the assessment. This concept was reflected in the scaling. The score scales 
developed for these assessments were cross-grade scales on a single 0-500 scale for all 
three grades in the assessment. 

In 1993, the National Assessment Governing Board (NAGB) determined that 
future NAEP assessments should be developed using within-grade frameworks. This 
removes the constraint that the trait being measured is cumulative, and there is no need 
for overlap of questions across grades. Consistent with this view, NAGB also declared 
that scaling be performed within-grade. Any items which happened to be the same 
across grades in the assessment were scaled separately for each grade, thus allowing 
common items, potentially, to function differently in the separate grades. The 1994 
NAEP history and geography assessments were developed and scaled within-grade. 
After scaling, the scales were aligned so that grade 8 had a higher mean than did grade 
4, and grade 12 had a higher mean than grade 8. The results were reported on a final 
0-500 scale that looked similar to those used in mathematics and reading, in spite of the 
differences in development and scaling. This definition of the reporting scale was a 
source of potential confusion and misinterpretation. 

The 1996 science assessment was also developed and scaled using within-grade 
procedures. A new reporting metric was adopted to differ from the 0-to-500 reporting 
scales used in other NAEP subject areas in order to minimize confusion with other 
common test scales and to discourage cross-grade comparisons. For each grade in the 
national assessment, the mean for each field of science was set at 150 and the standard 
deviation was set at 35. First, the reporting metric was developed using data from the 
national assessment program; the results for the state assessment program were then 
linked to that scale using procedures described in Section C.10. 

In addition to the plausible values for each scale, a composite of the three fields 
of science scales was created as a measure of overall science performance; as for the 
individual fields of science scales, the mean of the composite scale was set to 150 with 
a standard deviation of 35.² This composite was a weighted average of the plausible 
values for the three fields of science scales. The scales were weighted proportionally 
to the relative importance assigned to each field of science in the science framework (see 
Table B.1). The definition of the composite for the state assessment program was 
identical to that used for the national eighth-grade science assessments. 
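
The composite can be illustrated as a weighted average of a student's plausible values on the three field scales. The weights in the sketch are placeholders chosen only so that the example runs; the actual framework weights are those given in Table B.1, and the function name is hypothetical.

```python
# Placeholder weights for illustration only; the actual values are in Table B.1.
FIELD_WEIGHTS = {"earth": 0.30, "physical": 0.30, "life": 0.40}


def composite_score(field_scores, weights=FIELD_WEIGHTS):
    """Weighted average of plausible values on the three field-of-science scales."""
    if set(field_scores) != set(weights):
        raise ValueError("field names must match the weights")
    total_weight = sum(weights.values())
    return sum(weights[f] * field_scores[f] for f in weights) / total_weight


if __name__ == "__main__":
    print(composite_score({"earth": 150.0, "physical": 160.0, "life": 140.0}))
```

A weighted average of three scales does not in general keep a standard deviation of 35, so a further linear transformation is implied by the statement that the composite mean and standard deviation were set to 150 and 35.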

C.10 Linking the State Results to the National Results 

A major purpose of the state assessment program was to allow each participating 
jurisdiction to compare its 1996 results with those for the nation as a whole and with 
those for the region of the country where it is located. For meaningful comparisons to 
be made between each jurisdiction and the relevant national sample, results from these 
two assessments had to be expressed in terms of a similar system of scale units. 

The results from the state assessment program were linked to those from the 
national assessment through linking functions determined by comparing the results for 
the aggregate of all students assessed in the state assessment program with the results 
for eighth-grade students within the National Linking Sample of the national NAEP. 
The National Linking Sample of the national NAEP is a representative sample of the 
population of all grade-eligible public school students within the aggregate of 43 
participating states and the District of Columbia. (Guam and the two DoDEA 
jurisdictions were not included in the National Linking Sample.) Specifically, the 
National Linking Sample for science consisted of all eighth-grade students in public 
schools in the states and the District of Columbia who were assessed in the national 
cross-sectional science assessment. 

A linear equating within each field of science scale was used to link the results 
of the state assessment program to the national assessment. For each scale, the adequacy 
of the linear equating was evaluated by comparing the distribution of science scale 
scores based on the aggregation of all assessed students at each grade from the 
participating states and the District of Columbia with the equivalent distribution based 
on the students in the National Linking Sample. In the estimation of these distributions, 
the students were weighted to represent the target population of public school students 
in the specified grade in the aggregation of the states and the District of Columbia. If 
a linear equating were adequate, the distribution for the aggregate of states and the 
District of Columbia and that for the National Linking Sample would have, to a close 
approximation, the same shape in terms of the skewness, kurtosis, and higher moments 
of the distributions. The only differences in the distributions allowed by linear equating 
would be in the means and variances. Generally, this has been found to be the case. 

Thus, each field of science scale was linked by matching the scale mean and 
standard deviation of the scale scores across all students in the state assessment 
(excluding Guam and the two DoDEA jurisdictions) to the corresponding mean and 
standard deviation across all students in the National Linking Sample. 
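
Matching means and standard deviations amounts to finding a slope and an intercept for a linear transformation. The sketch below shows the calculation under the assumption that weighted means and standard deviations are available for both samples; the function names and the toy data are hypothetical, and details such as the handling of plausible values are omitted.

```python
def weighted_mean_sd(scores, weights):
    """Weighted mean and standard deviation of a set of scale scores."""
    total = sum(weights)
    mu = sum(s * w for s, w in zip(scores, weights)) / total
    var = sum(w * (s - mu) ** 2 for s, w in zip(scores, weights)) / total
    return mu, var ** 0.5


def linear_link(state_scores, state_weights, national_scores, national_weights):
    """Return (slope, intercept) mapping the state metric onto the national metric."""
    mu_s, sd_s = weighted_mean_sd(state_scores, state_weights)
    mu_n, sd_n = weighted_mean_sd(national_scores, national_weights)
    slope = sd_n / sd_s
    intercept = mu_n - slope * mu_s
    return slope, intercept


if __name__ == "__main__":
    # Toy data: a provisional state metric and the linking sample on the reporting metric.
    slope, intercept = linear_link(
        [-1.0, 0.0, 1.0], [1.0, 1.0, 1.0],
        [115.0, 150.0, 185.0], [1.0, 1.0, 1.0],
    )
    print(slope, intercept)
```

Applying `slope * score + intercept` to every state score yields a distribution with the same mean and standard deviation as the linking sample while leaving its shape unchanged, which is why the adequacy check above compares skewness, kurtosis, and higher moments.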

² The national average of students in public and nonpublic schools combined is 150. The national average seen in the 
tables in this report is based on the average for public schools only (148). 

APPENDIX D 



Teacher Preparation 

Because teachers are key to improving science education, their background and 
professional development should be examined. Eighth-grade science teachers completed 
questionnaires about their background and training, including their experience, 
certification, undergraduate and graduate course work in science, and involvement in 
pre-service education. 

Consistent with procedures used throughout this report, the student was the unit 
of analysis. That is, the science teachers’ responses were linked to their students, and 
the data reported are the percentages of students taught by these teachers rather than the 
percentages of teachers. 
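
The difference between percentages of students and percentages of teachers can be shown with a short sketch. It assumes each student record carries a sampling weight and the response of that student's science teacher; the record layout and function name are hypothetical.

```python
def percent_students_by_teacher_response(students, response):
    """Weighted percentage of students whose teacher gave `response`.

    students: list of (student_weight, teacher_response) pairs.
    """
    total = sum(w for w, _ in students)
    if total == 0:
        raise ValueError("weights sum to zero")
    matched = sum(w for w, r in students if r == response)
    return 100.0 * matched / total


if __name__ == "__main__":
    # Two teachers: one with a master's degree teaching three sampled students,
    # one with a bachelor's degree teaching one sampled student.
    linked = [(1.0, "master"), (1.0, "master"), (1.0, "master"), (1.0, "bachelor")]
    print(percent_students_by_teacher_response(linked, "master"))
```

In this toy case half of the teachers hold a master's degree, but 75 percent of the students are taught by such a teacher, and it is the latter kind of figure that the tables in this appendix report.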

The tables in Appendix D represent only a few of the questions in the teacher 
questionnaire, and this small selection can give only a sketchy profile of the teachers.¹ 
A report scheduled to appear in early 1998 will explore more of the questions related 
to school and classroom policy and practices and should give a better picture of the 
nation’s teachers. 



¹ The interested reader can obtain additional information on teachers’ characteristics and qualifications and the conditions 
under which they teach in SASS by State (NCES 96-312) from the 1993-94 Schools and Staffing Survey. 

URL: http://www.ed.gov/NCES/pubs/96312.html. 

TABLE D.1 

Public School Teachers’ Reports on Their Highest Level of Education 

What is the highest academic                      Montana      West         Nation 
degree you hold?                                               Percentage 

Bachelor’s degree                                 71 (3.2)     67 (5.9)     55 (4.2) 
Master’s degree                                   28 (3.1)     31 (5.9)     34 (4.0) 
Education specialist’s or professional diploma     0 (****)     2 (1^)       9 (3.4) 
Doctorate or professional degree                   1 (0.2)      0 (****)     1 (0.5) 



The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each 
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In 
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). **** Standard 
error estimates cannot be accurately determined. 
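
The arithmetic behind this note can be sketched as follows, using the Master’s degree row above (Montana 28 with standard error 3.1, Nation 34 with standard error 4.0). The formula for the standard error of the difference shown here assumes the two estimates come from independent samples; that is an assumption of this sketch, and Appendix A remains the authority for the procedure actually used in the report. The function names are illustrative only.

```python
import math


def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Approximate 95 percent interval: estimate +/- 2 standard errors."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


def se_of_difference(se_a, se_b):
    """Standard error of a difference, ASSUMING independent samples."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


# Master's degree row of Table D.1: (percentage, standard error).
montana = (28.0, 3.1)
nation = (34.0, 4.0)

# Interval for the Montana estimate alone.
montana_low, montana_high = confidence_interval(*montana)

# Interval for the Montana-minus-Nation difference.
difference = montana[0] - nation[0]
se_diff = se_of_difference(montana[1], nation[1])
diff_low, diff_high = confidence_interval(difference, se_diff)

# If the interval for the difference contains zero, the two percentages
# cannot be said to differ at about the 95 percent confidence level.
differs = not (diff_low <= 0.0 <= diff_high)
print(montana_low, montana_high, se_diff, diff_low, diff_high, differs)
```

Under these assumptions the interval for the difference spans zero, so this row alone would not support a claim that Montana differs from the nation.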

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
TABLE D.2 

Public School Teachers’ Reports on Their Major Fields of Study 

What were your major fields of 
study? (multiple responses possible)      Montana      West         Nation 
                                                       Percentage 

Undergraduate 
  Education or elementary education       47 (4.3)     29 (4.2)     38 (3.7) 
  Secondary education                     42 (4.4)     36 (8.5)     41 (4.5) 
  Science education                       42 (3.9)     33 (8.4)     36 (4.2) 
  Life science                            56 (3.0)     51 (5.7)     43 (5.1) 
  Physical science                        35 (3.8)     16 (6.8)     19 (5.0) 
  Earth science                           25 (3.6)     20 (6.7)     22 (4.1) 
  Other                                   32 (3.7)     35 (10.4)    35 (4.7) 

Graduate 
  Education or elementary education       25 (2.7)     36 (7.9)     27 (3.8) 
  Secondary education                     15 (3.0)     17 (4.8)     26 (3.4) 
  Science education                       31 (4.2)     20 (4.8)     28 (5.0) 
  Life science                            19 (3.3)      9 (3.2)     10 (1.8) 
  Physical science                         9 (1.3)      3 (1.2)      5 (1.5) 
  Earth science                            6 (1.4)      6 (2.5)      9 (2.4) 
  Other                                   27 (3.2)     44 (5.8)     42 (4.5) 
  No graduate study                       30 (4.2)     11 (3.5)     13 (2.4) 



The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each 
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In 
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
TABLE D.3 

Public School Teachers’ Reports on Their Teaching Certification 

                                          Montana      West         Nation 
                                                       Percentage 

What type of teaching certification 
do you have in this state in your main 
assignment field? 

  I don’t have a certificate in 
    my main assignment field.              0 (****)     1 (****)     1 (0.5) 
  Certification by an accreditation 
    body other than the state              0 (****)     0 (****)     0 (****) 
  Temporary, provisional, or 
    emergency state certificate            8 (1.8)      5 (2.9)      4 (1.3) 
  Probationary state certificate 
    (initial certificate)                  1 (0.1)      1            3 (1.3) 
  Regular or standard state certificate   78 (2.6)     89 (4.7)     79 (3.5) 
  Advanced professional certificate       13 (1.9)      4 (****)    13 (3.0) 

Do you have teaching certification in any of the 
following areas that is recognized by the state 
in which you teach? (multiple responses possible) 

  Elementary or middle/junior 
    high school education                 48 (3.2)     77 (5.6)     66 (5.9) 
  Elementary science                      13 (2.5)     45 (10.1)    25 (4.3) 
  Middle/junior high school or 
    secondary science                     84 (4.5)     97 (1.2)     95 (1.6) 
  Other                                   17 (4.5)     59 (9.6)     51 (6.3) 


The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each 
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In 
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). **** Standard 
error estimates cannot be accurately determined. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
TABLE D.4 

Public School Teachers’ Reports on Years of Teaching Experience 

Counting this year, how many years 
have you . . .                            Montana      West         Nation 
                                                       Percentage 

taught at either the elementary 
or secondary level? ¹ 
  2 years or less                         11 (3.2)      7 (2.6)      9 (2.2) 
  3-5 years                               20 (2.8)     11 (3.3)      9 (1.7) 
  6-10 years                              20 (3.6)     36 (7.4)     22 (3.2) 
  11-24 years                             29 (3.4)     37 (6.6)     36 (4.1) 
  25 years or more                        20 (3.4)     10 (3.6)     24 (3.2) 

taught science? ² 
  2 years or less                         12 (3.3)      9 (2.7)     13 (2.4) 
  3-5 years                               22 (2.8)     16 (4.6)     11 (2.2) 
  6-10 years                              20 (3.6)     44 (7.1)     30 (3.2) 
  11-24 years                             30 (3.4)     25 (4.8)     26 (3.4) 
  25 years or more                        16 (3.4)      5 (3.2)     20 (3.0) 


The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each 
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In 
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). ¹ Teachers were 
instructed to include part-time teaching experience. ² Teachers were instructed to include full-time and part-time 
assignments, but not substitute assignments. 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
TABLE D.5 

Public School Teachers’ Reports on Recent Course Taking 

During the last two years, how 
many college or university courses 
have you taken in science or 
science education?                        Montana      West         Nation 
                                                       Percentage 

None                                      39 (4.3)     54 (6.3)     59 (3.4) 
One                                       20 (4.4)     17 (7.1)     14 (2.8) 
Two                                       21 (2.8)      7 (3.2)     11 (2.4) 
Three or more                             20 (3.8)     22 (7.0)     16 (2.8) 



The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each 
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In 
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
TABLE D.6 

Public School Teachers’ Reports on Professional Development Activities 

                                          Montana      West         Nation 
                                                       Percentage 

During the past two years, have you taken 
college or university courses in any of the 
following? 

  Methods of teaching science             15 (3.3)     12 (2.9)     12 (2.2) 
  Biology/life science                    13 (3.5)     20 (6.6)     14 (2.7) 
  Chemistry                                8 (2.6)      6 (2.8)      6 (1.7) 
  Physics                                  6 (1.7)     10 (3.0)      8 (1.8) 
  Earth science                           12 (3.2)      9 (3.6)      9 (2.0) 

During the past five years, have you taken 
courses or participated in professional 
development activities in any of the following? 

  Use of computers for data acquisition   57 (3.8)     40 (7.7)     50 (4.6) 
  Use of computers for data analysis      48 (3.7)     52 (8.3)     54 (4.4) 
  Use of multimedia for science education 35 (3.6)     68 (6.2)     54 (4.5) 
  Laboratory management or safety         22 (3.4)     30 (5.3)     28 (3.8) 
  Integrated science instruction          31 (3.7)     70 (7.2)     46 (4.2) 



The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each 
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In 
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
TABLE D.7 

Public School Teachers’ Reports on Professional Development 

During the last year, how much time 
in total have you spent in 
professional development 
workshops or seminars in science 
or science education?                     Montana      West         Nation 
                                                       Percentage 

None                                       4 (1.2)      2 (1.2)      8 (2.5) 
Less than six hours                       12 (3.1)      7 (2.1)     16 (4.2) 
6-15 hours                                30 (2.4)     19 (5.9)     19 (2.7) 
16-35 hours                               24 (2.7)     26 (6.3)     26 (4.1) 
More than 35 hours                        29 (3.4)     46 (7.7)     31 (3.5) 



The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each 
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In 
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
TABLE D.8 

Public School Teachers’ Reports on Membership in Professional Societies 

Do you belong to one or more 
professional organizations related 
to science?                               Montana      West         Nation 
                                                       Percentage 

Yes                                       52 (3.3)     48 (5.9)     57 (4.5) 
No                                        48 (3.3)     52 (5.9)     43 (4.5) 



The standard errors of the statistics appear in parentheses. It can be said with about 95 percent confidence that, for each 
population of interest, the value for the entire population is within ± 2 standard errors of the estimate for the sample. In 
comparing two estimates, one must use the standard error of the difference (see Appendix A for details). 

SOURCE: National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 1996 Science 
Assessment 
ERRATA NOTICE 



Date: December 29, 1997 

To: Participants in the NAEP 1996 Science State Assessment 

From: Nada Ballator 

Center for the Assessment of Educational Progress at Educational Testing Service 
1-800-223-0267 

Re: Replacement pages attached for NAEP 1996 Science State Reports, correcting 
error in national and regional data in Table 6.2 and associated text 



An error was recently discovered in the national and regional data presented in 
Table 6.2 of the 1996 science state reports. For all states and jurisdictions, the data 
are correct; however, incorrect national data made it necessary to recompute 
comparisons between state and national results. The error involved the student 
background item, “About how many books are in your home?” which is reported in the 
NAEP 1996 Science State Report in Table 6.2, as well as in the bullets comparing your 
jurisdiction with the nation. 

Attached to this memo are the two corrected pages to insert into your printed 
reports. If you received camera-ready copy of the NAEP 1996 science state report, we 
have also enclosed pages for insertion there. The pages are for Chapter 6 in the section 
on “Literacy Materials in the Home” which includes Table 6.2; they contain revised 
comparisons to national data, and revised national and regional data in the table. We 
apologize for the publication of inaccurate data, and for the extra effort its correction 
will cause you. 

The state science reports appear on the NCES web site 
(http://nces.ed.gov/naep). All affected reports on the web were corrected on 
December 17. There is now a Revised logo beside the reports on the Index of Results 
and Summary Data web page (http://nces.ed.gov/naep/rsdindex.shtml) and on the 
Current Assessment Results web page (http://nces.ed.gov/naep/naep1996.html), and 
an Errata Notice containing a brief description of the repair on the NAEP 1996 Science 
State Reports web page (http://nces.ed.gov/naep/96state/97499.shtml). 

Also on the web site, the student data tables for national science results for public 
schools have been revised. On the web page for NAEP 1996 Summary Data Tables, 
Student Data (http://nces.ed.gov/naep/tables96/index.shtml), you will see an Errata 
Notice describing the repair. Please alert anyone who may be using national 1996 
science student data to this revision concerning the raw variable, “How many books are 
in your home,” and the derived variable HOMEEN3, “Home environment - Articles 
(of 4) in home.” 

We very much regret the extra work that this error may have necessitated in your 
jurisdiction; we will redouble our efforts to prevent such things happening again. 